2022-11-23T01:10:21.9639248Z Requested labels: linux.8xlarge.nvidia.gpu 2022-11-23T01:10:21.9639343Z Job defined at: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/master 2022-11-23T01:10:21.9639366Z Waiting for a runner to pick up this job... 2022-11-23T01:10:22.3863891Z Job is about to start running on the runner: i-088dc030290e38a53 (organization) 2022-11-23T01:10:27.2398022Z Current runner version: '2.299.1' 2022-11-23T01:10:27.2405856Z Runner name: 'i-088dc030290e38a53' 2022-11-23T01:10:27.2406585Z Runner group name: 'Default' 2022-11-23T01:10:27.2407316Z Machine name: 'ip-10-0-2-109' 2022-11-23T01:10:27.2410183Z ##[group]GITHUB_TOKEN Permissions 2022-11-23T01:10:27.2411163Z Actions: write 2022-11-23T01:10:27.2411684Z Checks: write 2022-11-23T01:10:27.2412069Z Contents: write 2022-11-23T01:10:27.2412511Z Deployments: write 2022-11-23T01:10:27.2412963Z Discussions: write 2022-11-23T01:10:27.2413348Z Issues: write 2022-11-23T01:10:27.2413870Z Metadata: read 2022-11-23T01:10:27.2414306Z Packages: write 2022-11-23T01:10:27.2414747Z Pages: write 2022-11-23T01:10:27.2415140Z PullRequests: write 2022-11-23T01:10:27.2415641Z RepositoryProjects: write 2022-11-23T01:10:27.2416166Z SecurityEvents: write 2022-11-23T01:10:27.2416569Z Statuses: write 2022-11-23T01:10:27.2416999Z ##[endgroup] 2022-11-23T01:10:27.2421257Z Secret source: Actions 2022-11-23T01:10:27.2422286Z Prepare workflow directory 2022-11-23T01:10:27.6177653Z Prepare all required actions 2022-11-23T01:10:27.6417611Z Getting action download info 2022-11-23T01:10:27.8379175Z Download action repository 'pytorch/test-infra@main' (SHA:c57ff4d9a93667a5571a80a0e92c3e2674aeedfd) 2022-11-23T01:10:28.1465492Z Download action repository 'pytorch/pytorch@master' (SHA:1cfd3858ac54fe3883534309081631a0a892ba3f) 2022-11-23T01:10:31.3258550Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2022-11-23T01:10:31.6055224Z Getting action download info 
2022-11-23T01:10:31.7832421Z Download action repository 'malfet/checkout@silent-checkout' (SHA:c7b8fef48edfe1bca0044a44b1f7f7c4318a3076) 2022-11-23T01:10:31.9684856Z Getting action download info 2022-11-23T01:10:32.1420840Z Download action repository 'nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767' (SHA:7d4a37704547a311dbb66ebdf5b23ec19374a767) 2022-11-23T01:10:32.4297473Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml 2022-11-23T01:10:32.4299957Z ##[group] Inputs 2022-11-23T01:10:32.4300352Z build-environment: linux-bionic-cuda11.7-py3.10-gcc7 2022-11-23T01:10:32.4301968Z test-matrix: { include: [ { config: "default", shard: 1, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 2, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 3, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 4, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "functorch", shard: 1, num_shards: 1, runner: "linux.4xlarge.nvidia.gpu" }, { config: "slow", shard: 1, num_shards: 2, runner: "linux.4xlarge.nvidia.gpu" }, { config: "slow", shard: 2, num_shards: 2, runner: "linux.4xlarge.nvidia.gpu" }, { config: "nogpu_AVX512", shard: 1, num_shards: 1, runner: "linux.2xlarge" }, { config: "nogpu_NO_AVX2", shard: 1, num_shards: 1, runner: "linux.2xlarge" }, { config: "jit_legacy", shard: 1, num_shards: 1, runner: "linux.4xlarge.nvidia.gpu" }, { config: "distributed", shard: 1, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "distributed", shard: 2, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "distributed", shard: 3, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, ]} 2022-11-23T01:10:32.4303828Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:10:32.4304320Z sync-tag: 2022-11-23T01:10:32.4305417Z timeout-minutes: 240 
2022-11-23T01:10:32.4305719Z ##[endgroup] 2022-11-23T01:10:32.4306566Z Complete job name: linux-bionic-cuda11.7-py3.10-gcc7 / test (distributed, 3, 3, linux.8xlarge.nvidia.gpu) 2022-11-23T01:10:32.5437195Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2022-11-23T01:10:32.5437603Z with: 2022-11-23T01:10:32.5438199Z github-secret: *** 2022-11-23T01:10:32.5438491Z activate-with-label: false 2022-11-23T01:10:32.5438768Z label: with-ssh 2022-11-23T01:10:32.5439198Z remove-existing-keys: true 2022-11-23T01:10:32.5439452Z env: 2022-11-23T01:10:32.5439700Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:32.5439968Z ##[endgroup] 2022-11-23T01:10:32.6520159Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys 2022-11-23T01:10:32.6759132Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@master 2022-11-23T01:10:32.6759502Z with: 2022-11-23T01:10:32.6759747Z submodules: recursive 2022-11-23T01:10:32.6760010Z fetch-depth: 0 2022-11-23T01:10:32.6760229Z env: 2022-11-23T01:10:32.6760491Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:32.6760802Z ##[endgroup] 2022-11-23T01:10:32.7059542Z ##[group]Run retry () { 2022-11-23T01:10:32.7059888Z retry () { 2022-11-23T01:10:32.7060191Z  $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*) 2022-11-23T01:10:32.7060501Z } 2022-11-23T01:10:32.7060767Z echo "${GITHUB_WORKSPACE}" 2022-11-23T01:10:32.7061093Z if [ -z "${NO_SUDO}" ]; then 2022-11-23T01:10:32.7061398Z  retry sudo rm -rf "${GITHUB_WORKSPACE}" 2022-11-23T01:10:32.7061688Z else 2022-11-23T01:10:32.7061970Z  retry rm -rf "${GITHUB_WORKSPACE}" 2022-11-23T01:10:32.7062229Z fi 2022-11-23T01:10:32.7062535Z mkdir "${GITHUB_WORKSPACE}" 2022-11-23T01:10:32.7082480Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:10:32.7082802Z env: 2022-11-23T01:10:32.7083064Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:32.7083332Z NO_SUDO: 2022-11-23T01:10:32.7083560Z ##[endgroup] 
2022-11-23T01:10:32.7214671Z /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:10:35.7277590Z ##[group]Run malfet/checkout@silent-checkout 2022-11-23T01:10:35.7278063Z with: 2022-11-23T01:10:35.7278438Z ref: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:10:35.7278814Z fetch-depth: 0 2022-11-23T01:10:35.7279169Z submodules: recursive 2022-11-23T01:10:35.7279536Z quiet-checkout: true 2022-11-23T01:10:35.7279920Z repository: pytorch/pytorch 2022-11-23T01:10:35.7280522Z token: *** 2022-11-23T01:10:35.7280865Z ssh-strict: true 2022-11-23T01:10:35.7281226Z persist-credentials: true 2022-11-23T01:10:35.7281592Z clean: true 2022-11-23T01:10:35.7281927Z lfs: false 2022-11-23T01:10:35.7282262Z set-safe-directory: true 2022-11-23T01:10:35.7282616Z env: 2022-11-23T01:10:35.7282954Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:35.7283298Z ##[endgroup] 2022-11-23T01:10:35.8823079Z Syncing repository: pytorch/pytorch 2022-11-23T01:10:35.8824963Z ##[group]Getting Git version info 2022-11-23T01:10:35.8825521Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-11-23T01:10:35.8826141Z [command]/usr/bin/git version 2022-11-23T01:10:35.8826401Z git version 2.37.1 2022-11-23T01:10:35.8837010Z ##[endgroup] 2022-11-23T01:10:35.8858238Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/8a0adf32-21e8-406d-a204-4d4f79d67913' before making global git config changes 2022-11-23T01:10:35.8859199Z Adding repository directory to the temporary git global config as a safe directory 2022-11-23T01:10:35.8865317Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:10:35.8910556Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-11-23T01:10:35.8918222Z ##[group]Initializing the repository 2022-11-23T01:10:35.8922066Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:10:35.8954623Z 
hint: Using 'master' as the name for the initial branch. This default branch name 2022-11-23T01:10:35.8955415Z hint: is subject to change. To configure the initial branch name to use in all 2022-11-23T01:10:35.8955876Z hint: of your new repositories, which will suppress this warning, call: 2022-11-23T01:10:35.8956205Z hint: 2022-11-23T01:10:35.8956593Z hint: git config --global init.defaultBranch <name> 2022-11-23T01:10:35.8957333Z hint: 2022-11-23T01:10:35.8958203Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2022-11-23T01:10:35.8959193Z hint: 'development'. The just-created branch can be renamed via this command: 2022-11-23T01:10:35.8959639Z hint: 2022-11-23T01:10:35.8960064Z hint: git branch -m <name> 2022-11-23T01:10:35.8960607Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2022-11-23T01:10:35.8970055Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2022-11-23T01:10:35.9006226Z ##[endgroup] 2022-11-23T01:10:35.9006728Z ##[group]Disabling automatic garbage collection 2022-11-23T01:10:35.9010885Z [command]/usr/bin/git config --local gc.auto 0 2022-11-23T01:10:35.9044165Z ##[endgroup] 2022-11-23T01:10:35.9045159Z ##[group]Setting up auth 2022-11-23T01:10:35.9053883Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-11-23T01:10:35.9090212Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-11-23T01:10:35.9397787Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-11-23T01:10:35.9431754Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-11-23T01:10:35.9738334Z [command]/usr/bin/git config 
--local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-11-23T01:10:35.9788984Z ##[endgroup] 2022-11-23T01:10:35.9789461Z ##[group]Fetching the repository 2022-11-23T01:10:35.9798420Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --quiet --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2022-11-23T01:11:30.6055767Z [command]/usr/bin/git rev-parse --verify --quiet 1cfd3858ac54fe3883534309081631a0a892ba3f^{object} 2022-11-23T01:11:30.6086875Z 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:11:30.6092963Z ##[endgroup] 2022-11-23T01:11:30.6093446Z ##[group]Determining the checkout info 2022-11-23T01:11:30.6094676Z ##[endgroup] 2022-11-23T01:11:30.6095124Z ##[group]Checking out the ref 2022-11-23T01:11:30.6099665Z [command]/usr/bin/git checkout --quiet --force 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:11:32.3618390Z ##[endgroup] 2022-11-23T01:11:32.3619288Z ##[group]Setting up auth for fetching submodules 2022-11-23T01:11:32.3625203Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-11-23T01:11:32.3686462Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2022-11-23T01:11:32.3719711Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2022-11-23T01:11:32.3752463Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2022-11-23T01:11:32.3784668Z ##[endgroup] 2022-11-23T01:11:32.3785152Z ##[group]Fetching submodules 2022-11-23T01:11:32.3790276Z [command]/usr/bin/git submodule sync --recursive 2022-11-23T01:11:32.4117405Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2022-11-23T01:11:32.4433379Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2022-11-23T01:11:32.4437653Z Submodule 
'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2022-11-23T01:11:32.4442725Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2022-11-23T01:11:32.4447466Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2022-11-23T01:11:32.4452377Z Submodule 'third_party/QNNPACK' (https://github.com/pytorch/QNNPACK) registered for path 'third_party/QNNPACK' 2022-11-23T01:11:32.4457840Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2022-11-23T01:11:32.4463085Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2022-11-23T01:11:32.4468364Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2022-11-23T01:11:32.4473823Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2022-11-23T01:11:32.4479881Z Submodule 'third_party/cub' (https://github.com/NVlabs/cub.git) registered for path 'third_party/cub' 2022-11-23T01:11:32.4485790Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2022-11-23T01:11:32.4491529Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2022-11-23T01:11:32.4497615Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen' 2022-11-23T01:11:32.4503753Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2022-11-23T01:11:32.4509992Z Submodule 'third_party/flatbuffers' 
(https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2022-11-23T01:11:32.4516562Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2022-11-23T01:11:32.4523102Z Submodule 'third_party/foxi' (https://github.com/houseroad/foxi.git) registered for path 'third_party/foxi' 2022-11-23T01:11:32.4529836Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:11:32.4536655Z Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo' 2022-11-23T01:11:32.4543796Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2022-11-23T01:11:32.4550884Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2022-11-23T01:11:32.4558961Z Submodule 'third_party/ios-cmake' (https://github.com/Yangqing/ios-cmake.git) registered for path 'third_party/ios-cmake' 2022-11-23T01:11:32.4566202Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2022-11-23T01:11:32.4573419Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2022-11-23T01:11:32.4581025Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl' 2022-11-23T01:11:32.4588638Z Submodule 'third_party/neon2sse' (https://github.com/intel/ARM_NEON_2_x86_SSE.git) registered for path 'third_party/neon2sse' 2022-11-23T01:11:32.4596756Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2022-11-23T01:11:32.4604602Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2022-11-23T01:11:32.4612785Z Submodule 
'third_party/onnx-tensorrt' (https://github.com/onnx/onnx-tensorrt) registered for path 'third_party/onnx-tensorrt' 2022-11-23T01:11:32.4620881Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2022-11-23T01:11:32.4629289Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2022-11-23T01:11:32.4638140Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2022-11-23T01:11:32.4647048Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2022-11-23T01:11:32.4655699Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2022-11-23T01:11:32.4664812Z Submodule 'third_party/python-enum' (https://github.com/PeachPy/enum34.git) registered for path 'third_party/python-enum' 2022-11-23T01:11:32.4673708Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2022-11-23T01:11:32.4683267Z Submodule 'third_party/python-six' (https://github.com/benjaminp/six.git) registered for path 'third_party/python-six' 2022-11-23T01:11:32.4692425Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2022-11-23T01:11:32.4701875Z Submodule 'third_party/tbb' (https://github.com/01org/tbb) registered for path 'third_party/tbb' 2022-11-23T01:11:32.4711595Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2022-11-23T01:11:32.4721983Z Submodule 'third_party/zstd' (https://github.com/facebook/zstd.git) registered for path 'third_party/zstd' 2022-11-23T01:11:32.4750511Z Cloning into 
'/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2022-11-23T01:11:32.7444792Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2022-11-23T01:11:32.9618597Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2022-11-23T01:11:33.2182204Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2022-11-23T01:11:33.5330568Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/QNNPACK'... 2022-11-23T01:11:33.8279790Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2022-11-23T01:11:35.8541346Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2022-11-23T01:11:41.4493281Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2022-11-23T01:11:41.8474636Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2022-11-23T01:11:42.4263940Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cub'... 2022-11-23T01:11:44.0173332Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2022-11-23T01:11:45.3686229Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2022-11-23T01:11:47.0258336Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/eigen'... 2022-11-23T01:11:53.8361962Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2022-11-23T01:11:54.5843476Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2022-11-23T01:11:56.1163639Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2022-11-23T01:11:57.2076583Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/foxi'... 
2022-11-23T01:11:57.4321110Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2022-11-23T01:11:57.9308661Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2022-11-23T01:11:58.4395693Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2022-11-23T01:11:59.4309837Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2022-11-23T01:11:59.8818532Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ios-cmake'... 2022-11-23T01:12:00.0849098Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2022-11-23T01:12:00.3592255Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2022-11-23T01:12:02.0003353Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'... 2022-11-23T01:12:02.5095171Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/neon2sse'... 2022-11-23T01:12:02.9491396Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2022-11-23T01:12:09.1787168Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2022-11-23T01:12:10.7895833Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt'... 2022-11-23T01:12:11.2296843Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2022-11-23T01:12:11.4810334Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2022-11-23T01:12:17.5814532Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2022-11-23T01:12:17.7731961Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 
2022-11-23T01:12:17.9950976Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2022-11-23T01:12:18.8553972Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-enum'... 2022-11-23T01:12:19.0815870Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2022-11-23T01:12:19.4191555Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-six'... 2022-11-23T01:12:19.7256560Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2022-11-23T01:12:20.3009292Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tbb'... 2022-11-23T01:12:22.8553068Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2022-11-23T01:12:23.3546243Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/zstd'... 2022-11-23T01:12:25.6866374Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2022-11-23T01:12:25.6993197Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2022-11-23T01:12:25.7091056Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2022-11-23T01:12:25.7376026Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2022-11-23T01:12:25.7648186Z Submodule path 'third_party/QNNPACK': checked out '7d2a4e9931a82adc3814275b6219a03e24e36b4c' 2022-11-23T01:12:25.8101022Z Submodule path 'third_party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191' 2022-11-23T01:12:26.5759176Z Submodule path 'third_party/XNNPACK': checked out 'ae108ef49aa5623b896fc93d4298c49d1750d9ba' 2022-11-23T01:12:26.6014501Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-11-23T01:12:26.7243011Z Submodule path 
'third_party/cpuinfo': checked out '8ec7bd91ad0470e61cf38f618cc1f270dede599c' 2022-11-23T01:12:26.7644987Z Submodule path 'third_party/cub': checked out 'd106ddb991a56c3df1b6d51b2409e36ba8181ce4' 2022-11-23T01:12:27.1253066Z Submodule path 'third_party/cudnn_frontend': checked out '171a7a986f7fbd9ed71bd0cf3c7ad4f55843d6b3' 2022-11-23T01:12:27.6493179Z Submodule path 'third_party/cutlass': checked out 'b72cbf957df8cf84a6d0ff91c190ad51a9c1d24a' 2022-11-23T01:12:27.9485834Z Submodule path 'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c' 2022-11-23T01:12:28.0062002Z Submodule path 'third_party/fbgemm': checked out '4d1738b3142a6cb0c032cd639e239566010b054a' 2022-11-23T01:12:28.0082042Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:12:28.0085111Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:12:28.0088088Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:12:28.0091534Z Submodule 'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:12:28.0119019Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'... 2022-11-23T01:12:28.9353909Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'... 2022-11-23T01:12:29.5116450Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'... 2022-11-23T01:12:30.5001080Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'... 
2022-11-23T01:12:30.8316512Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81' 2022-11-23T01:12:30.9571688Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3' 2022-11-23T01:12:31.0286741Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796' 2022-11-23T01:12:31.0401604Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '1840658c184f3eeba787dae0f06c45756c1daaf5' 2022-11-23T01:12:31.1569056Z Submodule path 'third_party/flatbuffers': checked out 'd0cede9c90c5257537c293517a21376408b549fa' 2022-11-23T01:12:31.1992879Z Submodule path 'third_party/fmt': checked out '7bdf0628b1276379886c7f6dda2cef2b3b374f0b' 2022-11-23T01:12:31.2093319Z Submodule path 'third_party/foxi': checked out 'c278588e34e535f0bb8f00df3880d26928038cad' 2022-11-23T01:12:31.2563612Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2022-11-23T01:12:31.2845678Z Submodule path 'third_party/gloo': checked out '4a5e339b764261d20fc409071dc7a8b8989aa195' 2022-11-23T01:12:31.3400986Z Submodule path 'third_party/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2022-11-23T01:12:31.3533295Z Submodule path 'third_party/ideep': checked out '5ddc65efe0428bbce2942b3ce5e3ce15239abe2f' 2022-11-23T01:12:31.3549491Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2022-11-23T01:12:31.3575917Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 
2022-11-23T01:12:39.9167034Z Submodule path 'third_party/ideep/mkl-dnn': checked out 'd19d0f795c60695bd32f894c6f01771b2dfbe24d' 2022-11-23T01:12:39.9187071Z Submodule 'third_party/oneDNN' (https://github.com/oneapi-src/oneDNN.git) registered for path 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:12:39.9214317Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn/third_party/oneDNN'... 2022-11-23T01:12:48.9089130Z Submodule path 'third_party/ideep/mkl-dnn/third_party/oneDNN': checked out '650085b2f3643aad05c629425983491d63b5c289' 2022-11-23T01:12:48.9206806Z Submodule path 'third_party/ios-cmake': checked out '8abaed637d56f1337d6e1d2c4026e25c1eade724' 2022-11-23T01:12:48.9383107Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42' 2022-11-23T01:12:49.0510916Z Submodule path 'third_party/kineto': checked out '6c1629809068efd78a8d56b4aa479c7ec49ae562' 2022-11-23T01:12:49.0529535Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:12:49.0532699Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:12:49.0560142Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2022-11-23T01:12:50.1539106Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 
2022-11-23T01:12:51.1717034Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '2591ab91c3898c9f6544fff04660276537d32ffd' 2022-11-23T01:12:51.2386664Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2022-11-23T01:12:51.2632902Z Submodule path 'third_party/nccl/nccl': checked out 'f89fd4777d2ef9229c039ff750ae21da01626f52' 2022-11-23T01:12:51.2797993Z Submodule path 'third_party/neon2sse': checked out '97a126f08ce318023be604d03f88bf0820a9464a' 2022-11-23T01:12:51.4151904Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806' 2022-11-23T01:12:51.7338498Z Submodule path 'third_party/onnx': checked out 'f7ee1ac60d06abe8e26c9b6bbe1e3db5286b614b' 2022-11-23T01:12:51.7373775Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark' 2022-11-23T01:12:51.7377128Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2022-11-23T01:12:51.7405014Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/benchmark'... 2022-11-23T01:12:52.1439879Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 
2022-11-23T01:12:53.0038925Z Submodule path 'third_party/onnx/third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-11-23T01:12:53.0422526Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'ffa346860b306c9bbfb341aed9c14c067751feb8' 2022-11-23T01:12:53.0599782Z Submodule path 'third_party/onnx-tensorrt': checked out 'c153211418a7c57ce071d9ce2a41f8d1c85a878f' 2022-11-23T01:12:53.0616553Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:12:53.0642717Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx'... 2022-11-23T01:12:54.9002579Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx': checked out '765f5ee823a67a866f4bd28a9860e81f3c811ce8' 2022-11-23T01:12:54.9024853Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:12:54.9028117Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:12:54.9056272Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'... 2022-11-23T01:12:55.3197780Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'... 
2022-11-23T01:12:56.2126179Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark': checked out 'e776aa0275e293707b6a0901e0e8d8a8a3679508' 2022-11-23T01:12:56.2914013Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11': checked out 'a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c' 2022-11-23T01:12:56.2930911Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:12:56.2958309Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'... 2022-11-23T01:12:57.7490171Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-11-23T01:12:57.7597309Z Submodule path 'third_party/pocketfft': checked out 'ea778e37710c07723435b1be58235996d1d43a5a' 2022-11-23T01:12:58.0808373Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2022-11-23T01:12:58.0831489Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:12:58.0834732Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2022-11-23T01:12:58.0863028Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2022-11-23T01:12:58.5070675Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 
2022-11-23T01:12:59.5042220Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2022-11-23T01:12:59.5864815Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2022-11-23T01:12:59.5960845Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2022-11-23T01:12:59.6085772Z Submodule path 'third_party/pthreadpool': checked out 'a134dd5d4cee80cce15db81a72e7f929d71dd413' 2022-11-23T01:12:59.6490618Z Submodule path 'third_party/pybind11': checked out '80dc998efced8ceb2be59756668a7e90e8bef917' 2022-11-23T01:12:59.6588181Z Submodule path 'third_party/python-enum': checked out '4cfedc426c4e2fc52e3f5c2b4297e15ed8d6b8c7' 2022-11-23T01:12:59.6931442Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2022-11-23T01:12:59.7036412Z Submodule path 'third_party/python-six': checked out '15e31431af97e5e64b80af0a3f598d382bcdd49a' 2022-11-23T01:12:59.7574488Z Submodule path 'third_party/sleef': checked out 'e0a003ee838b75d11763aa9c3ef17bf71a725bff' 2022-11-23T01:12:59.8936006Z Submodule path 'third_party/tbb': checked out 'a51a90bc609bb73db8ea13841b5cf7aa4344d4a9' 2022-11-23T01:12:59.9253207Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e' 2022-11-23T01:12:59.9271526Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:12:59.9274653Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:12:59.9278823Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:12:59.9282064Z Submodule 'third_party/pybind11' 
(https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:12:59.9308176Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2022-11-23T01:13:00.9457721Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2022-11-23T01:13:01.3529582Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2022-11-23T01:13:02.7382549Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2022-11-23T01:13:03.7701420Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2022-11-23T01:13:03.7873436Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2022-11-23T01:13:03.8669315Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242' 2022-11-23T01:13:03.9009625Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2022-11-23T01:13:03.9027976Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:03.9055689Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 
2022-11-23T01:13:04.1516884Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-11-23T01:13:04.3125698Z Submodule path 'third_party/zstd': checked out 'aec56a52fbab207fc639a1937d1e708a282edca8' 2022-11-23T01:13:04.3188029Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2022-11-23T01:13:04.3531395Z Entering 'android/libs/fbjni' 2022-11-23T01:13:04.3574883Z Entering 'third_party/FP16' 2022-11-23T01:13:04.3618100Z Entering 'third_party/FXdiv' 2022-11-23T01:13:04.3663770Z Entering 'third_party/NNPACK' 2022-11-23T01:13:04.3708147Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:04.3751719Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:04.3795366Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:04.3851178Z Entering 'third_party/benchmark' 2022-11-23T01:13:04.3894445Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:04.3938459Z Entering 'third_party/cub' 2022-11-23T01:13:04.3981694Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:04.4030538Z Entering 'third_party/cutlass' 2022-11-23T01:13:04.4082998Z Entering 'third_party/eigen' 2022-11-23T01:13:04.4128278Z Entering 'third_party/fbgemm' 2022-11-23T01:13:04.4170414Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:04.4213183Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:04.4256488Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:04.4299853Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:04.4343309Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:04.4388584Z Entering 'third_party/fmt' 2022-11-23T01:13:04.4431065Z Entering 'third_party/foxi' 2022-11-23T01:13:04.4474250Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:04.4516863Z Entering 'third_party/gloo' 2022-11-23T01:13:04.4560755Z Entering 'third_party/googletest' 2022-11-23T01:13:04.4603696Z Entering 'third_party/ideep' 
2022-11-23T01:13:04.4646737Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:04.4692088Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:04.4742371Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:04.4784518Z Entering 'third_party/ittapi' 2022-11-23T01:13:04.4827744Z Entering 'third_party/kineto' 2022-11-23T01:13:04.4869513Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:04.4911088Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:04.4955465Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:04.4998330Z Entering 'third_party/neon2sse' 2022-11-23T01:13:04.5040462Z Entering 'third_party/nlohmann' 2022-11-23T01:13:04.5084714Z Entering 'third_party/onnx' 2022-11-23T01:13:04.5139906Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:04.5182517Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:04.5227326Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:04.5268755Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:04.5317792Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:04.5360867Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:04.5403768Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:04.5450620Z Entering 'third_party/pocketfft' 2022-11-23T01:13:04.5492867Z Entering 'third_party/protobuf' 2022-11-23T01:13:04.5540379Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:04.5583082Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:04.5627248Z Entering 'third_party/psimd' 2022-11-23T01:13:04.5669252Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:04.5711753Z Entering 'third_party/pybind11' 2022-11-23T01:13:04.5757033Z Entering 'third_party/python-enum' 2022-11-23T01:13:04.5798979Z Entering 'third_party/python-peachpy' 
2022-11-23T01:13:04.5841365Z Entering 'third_party/python-six' 2022-11-23T01:13:04.5884312Z Entering 'third_party/sleef' 2022-11-23T01:13:04.5927115Z Entering 'third_party/tbb' 2022-11-23T01:13:04.5970941Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:04.6014327Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:04.6056539Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:04.6100036Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:04.6141802Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:04.6183373Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:04.6228628Z Entering 'third_party/zstd' 2022-11-23T01:13:04.6283661Z ##[endgroup] 2022-11-23T01:13:04.6285676Z ##[group]Persisting credentials for submodules 2022-11-23T01:13:04.6291281Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || : 2022-11-23T01:13:04.6617014Z Entering 'android/libs/fbjni' 2022-11-23T01:13:04.6659254Z Entering 'third_party/FP16' 2022-11-23T01:13:04.6701921Z Entering 'third_party/FXdiv' 2022-11-23T01:13:04.6746299Z Entering 'third_party/NNPACK' 2022-11-23T01:13:04.6787571Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:04.6829728Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:04.6871463Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:04.6927622Z Entering 'third_party/benchmark' 2022-11-23T01:13:04.6969669Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:04.7011188Z Entering 'third_party/cub' 2022-11-23T01:13:04.7053146Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:04.7101633Z Entering 'third_party/cutlass' 2022-11-23T01:13:04.7151075Z Entering 'third_party/eigen' 2022-11-23T01:13:04.7197929Z Entering 'third_party/fbgemm' 2022-11-23T01:13:04.7240015Z Entering 
'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:04.7281690Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:04.7324031Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:04.7365316Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:04.7408101Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:04.7452194Z Entering 'third_party/fmt' 2022-11-23T01:13:04.7494431Z Entering 'third_party/foxi' 2022-11-23T01:13:04.7537510Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:04.7580136Z Entering 'third_party/gloo' 2022-11-23T01:13:04.7622139Z Entering 'third_party/googletest' 2022-11-23T01:13:04.7665042Z Entering 'third_party/ideep' 2022-11-23T01:13:04.7705915Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:04.7751375Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:04.7802566Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:04.7847223Z Entering 'third_party/ittapi' 2022-11-23T01:13:04.7890104Z Entering 'third_party/kineto' 2022-11-23T01:13:04.7931965Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:04.7973403Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:04.8018124Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:04.8059791Z Entering 'third_party/neon2sse' 2022-11-23T01:13:04.8101422Z Entering 'third_party/nlohmann' 2022-11-23T01:13:04.8145562Z Entering 'third_party/onnx' 2022-11-23T01:13:04.8201812Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:04.8243803Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:04.8287433Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:04.8328993Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:04.8375306Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:04.8416498Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 
2022-11-23T01:13:04.8457647Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:04.8503918Z Entering 'third_party/pocketfft' 2022-11-23T01:13:04.8544848Z Entering 'third_party/protobuf' 2022-11-23T01:13:04.8591065Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:04.8632348Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:04.8685750Z Entering 'third_party/psimd' 2022-11-23T01:13:04.8727434Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:04.8769262Z Entering 'third_party/pybind11' 2022-11-23T01:13:04.8810193Z Entering 'third_party/python-enum' 2022-11-23T01:13:04.8851948Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:04.8893581Z Entering 'third_party/python-six' 2022-11-23T01:13:04.8935085Z Entering 'third_party/sleef' 2022-11-23T01:13:04.8976482Z Entering 'third_party/tbb' 2022-11-23T01:13:04.9020741Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:04.9062587Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:04.9103365Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:04.9144933Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:04.9185713Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:04.9226572Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:04.9271765Z Entering 'third_party/zstd' 2022-11-23T01:13:04.9341967Z [command]/usr/bin/git submodule foreach --recursive git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url 2022-11-23T01:13:04.9664007Z Entering 'android/libs/fbjni' 2022-11-23T01:13:04.9704417Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2022-11-23T01:13:04.9722745Z Entering 'third_party/FP16' 2022-11-23T01:13:04.9763608Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2022-11-23T01:13:04.9781590Z Entering 'third_party/FXdiv' 2022-11-23T01:13:04.9821655Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2022-11-23T01:13:04.9840397Z Entering 'third_party/NNPACK' 2022-11-23T01:13:04.9880885Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2022-11-23T01:13:04.9898737Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:04.9937744Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/QNNPACK/config remote.origin.url 2022-11-23T01:13:04.9955845Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:05.0022420Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2022-11-23T01:13:05.0040512Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:05.0079878Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2022-11-23T01:13:05.0109339Z Entering 'third_party/benchmark' 2022-11-23T01:13:05.0148819Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:05.0167820Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:05.0206990Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2022-11-23T01:13:05.0224876Z Entering 'third_party/cub' 2022-11-23T01:13:05.0264476Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cub/config remote.origin.url 2022-11-23T01:13:05.0282558Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:05.0328982Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config 
remote.origin.url 2022-11-23T01:13:05.0352223Z Entering 'third_party/cutlass' 2022-11-23T01:13:05.0390510Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2022-11-23T01:13:05.0415222Z Entering 'third_party/eigen' 2022-11-23T01:13:05.0455253Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2022-11-23T01:13:05.0476437Z Entering 'third_party/fbgemm' 2022-11-23T01:13:05.0515288Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2022-11-23T01:13:05.0535122Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:05.0573838Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2022-11-23T01:13:05.0591200Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:05.0629786Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2022-11-23T01:13:05.0648174Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:05.0687195Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:05.0704387Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:05.0742514Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2022-11-23T01:13:05.0761207Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:05.0800644Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2022-11-23T01:13:05.0819970Z Entering 'third_party/fmt' 2022-11-23T01:13:05.0858685Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2022-11-23T01:13:05.0876092Z Entering 'third_party/foxi' 2022-11-23T01:13:05.0914022Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/foxi/config remote.origin.url 2022-11-23T01:13:05.0931536Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:05.0971274Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2022-11-23T01:13:05.0988214Z Entering 'third_party/gloo' 2022-11-23T01:13:05.1026531Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2022-11-23T01:13:05.1044508Z Entering 'third_party/googletest' 2022-11-23T01:13:05.1083223Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:05.1100974Z Entering 'third_party/ideep' 2022-11-23T01:13:05.1139861Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2022-11-23T01:13:05.1156433Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:05.1195095Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2022-11-23T01:13:05.1215959Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:05.1255966Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/modules/third_party/oneDNN/config remote.origin.url 2022-11-23T01:13:05.1280873Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:05.1320694Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ios-cmake/config remote.origin.url 2022-11-23T01:13:05.1337790Z Entering 'third_party/ittapi' 2022-11-23T01:13:05.1376667Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2022-11-23T01:13:05.1393553Z Entering 'third_party/kineto' 2022-11-23T01:13:05.1432365Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2022-11-23T01:13:05.1450250Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:05.1489082Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2022-11-23T01:13:05.1506792Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:05.1545982Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2022-11-23T01:13:05.1565661Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:05.1607646Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2022-11-23T01:13:05.1624955Z Entering 'third_party/neon2sse' 2022-11-23T01:13:05.1665964Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/neon2sse/config remote.origin.url 2022-11-23T01:13:05.1683883Z Entering 'third_party/nlohmann' 2022-11-23T01:13:05.1722314Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2022-11-23T01:13:05.1740836Z Entering 'third_party/onnx' 2022-11-23T01:13:05.1779636Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2022-11-23T01:13:05.1810527Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:05.1849944Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:05.1867462Z Entering 'third_party/onnx/third_party/pybind11' 
2022-11-23T01:13:05.1907096Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:05.1927356Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:05.1967267Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/config remote.origin.url 2022-11-23T01:13:05.1984261Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:05.2022804Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/config remote.origin.url 2022-11-23T01:13:05.2046953Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:05.2086690Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:05.2103970Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:05.2143047Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:05.2160391Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:05.2200715Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-11-23T01:13:05.2223081Z Entering 'third_party/pocketfft' 2022-11-23T01:13:05.2263301Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2022-11-23T01:13:05.2281196Z Entering 'third_party/protobuf' 2022-11-23T01:13:05.2320238Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config 
remote.origin.url 2022-11-23T01:13:05.2340814Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:05.2381809Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:05.2399671Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:05.2438179Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:05.2457463Z Entering 'third_party/psimd' 2022-11-23T01:13:05.2496874Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2022-11-23T01:13:05.2514158Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:05.2552467Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2022-11-23T01:13:05.2569561Z Entering 'third_party/pybind11' 2022-11-23T01:13:05.2608614Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:05.2626230Z Entering 'third_party/python-enum' 2022-11-23T01:13:05.2665223Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-enum/config remote.origin.url 2022-11-23T01:13:05.2682688Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:05.2722199Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2022-11-23T01:13:05.2739503Z Entering 'third_party/python-six' 2022-11-23T01:13:05.2778536Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-six/config remote.origin.url 2022-11-23T01:13:05.2795460Z Entering 'third_party/sleef' 2022-11-23T01:13:05.2834681Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config 
remote.origin.url 2022-11-23T01:13:05.2852625Z Entering 'third_party/tbb' 2022-11-23T01:13:05.2942318Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tbb/config remote.origin.url 2022-11-23T01:13:05.2963777Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:05.3002828Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2022-11-23T01:13:05.3020162Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:05.3058746Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:05.3075893Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:05.3114101Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2022-11-23T01:13:05.3131286Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:05.3171699Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2022-11-23T01:13:05.3189016Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:05.3227526Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:05.3244850Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:05.3283411Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-11-23T01:13:05.3302955Z Entering 'third_party/zstd' 2022-11-23T01:13:05.3342294Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/zstd/config remote.origin.url 2022-11-23T01:13:05.4202121Z [command]/usr/bin/git 
submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2022-11-23T01:13:05.4523218Z Entering 'android/libs/fbjni' 2022-11-23T01:13:05.4566641Z Entering 'third_party/FP16' 2022-11-23T01:13:05.4610275Z Entering 'third_party/FXdiv' 2022-11-23T01:13:05.4655140Z Entering 'third_party/NNPACK' 2022-11-23T01:13:05.4699276Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:05.4742622Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:05.4786413Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:05.4841719Z Entering 'third_party/benchmark' 2022-11-23T01:13:05.4886013Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:05.4930153Z Entering 'third_party/cub' 2022-11-23T01:13:05.4973751Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:05.5023068Z Entering 'third_party/cutlass' 2022-11-23T01:13:05.5073306Z Entering 'third_party/eigen' 2022-11-23T01:13:05.5120523Z Entering 'third_party/fbgemm' 2022-11-23T01:13:05.5164751Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:05.5208547Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:05.5252485Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:05.5298605Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:05.5344964Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:05.5391292Z Entering 'third_party/fmt' 2022-11-23T01:13:05.5437405Z Entering 'third_party/foxi' 2022-11-23T01:13:05.5482592Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:05.5528996Z Entering 'third_party/gloo' 2022-11-23T01:13:05.5574756Z Entering 'third_party/googletest' 2022-11-23T01:13:05.5620064Z Entering 'third_party/ideep' 2022-11-23T01:13:05.5664335Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:05.5711476Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:05.5764418Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:05.5809467Z Entering 'third_party/ittapi' 
2022-11-23T01:13:05.5854883Z Entering 'third_party/kineto' 2022-11-23T01:13:05.5899305Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:05.5942273Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:05.5987554Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:05.6031658Z Entering 'third_party/neon2sse' 2022-11-23T01:13:05.6075265Z Entering 'third_party/nlohmann' 2022-11-23T01:13:05.6120440Z Entering 'third_party/onnx' 2022-11-23T01:13:05.6179263Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:05.6224470Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:05.6270217Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:05.6313054Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:05.6362156Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:05.6405831Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:05.6450150Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:05.6499143Z Entering 'third_party/pocketfft' 2022-11-23T01:13:05.6546535Z Entering 'third_party/protobuf' 2022-11-23T01:13:05.6594078Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:05.6643946Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:05.6689013Z Entering 'third_party/psimd' 2022-11-23T01:13:05.6733279Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:05.6778593Z Entering 'third_party/pybind11' 2022-11-23T01:13:05.6822793Z Entering 'third_party/python-enum' 2022-11-23T01:13:05.6866290Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:05.6909641Z Entering 'third_party/python-six' 2022-11-23T01:13:05.6953877Z Entering 'third_party/sleef' 2022-11-23T01:13:05.6997288Z Entering 'third_party/tbb' 2022-11-23T01:13:05.7042591Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:05.7088497Z Entering 
'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:05.7131472Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:05.7174614Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:05.7217883Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:05.7260691Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:05.7306181Z Entering 'third_party/zstd' 2022-11-23T01:13:05.7363852Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2022-11-23T01:13:05.7687822Z Entering 'android/libs/fbjni' 2022-11-23T01:13:05.7730589Z Entering 'third_party/FP16' 2022-11-23T01:13:05.7775922Z Entering 'third_party/FXdiv' 2022-11-23T01:13:05.7819667Z Entering 'third_party/NNPACK' 2022-11-23T01:13:05.7863620Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:05.7907794Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:05.7952167Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:05.8007515Z Entering 'third_party/benchmark' 2022-11-23T01:13:05.8050300Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:05.8094508Z Entering 'third_party/cub' 2022-11-23T01:13:05.8137896Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:05.8188706Z Entering 'third_party/cutlass' 2022-11-23T01:13:05.8239673Z Entering 'third_party/eigen' 2022-11-23T01:13:05.8285632Z Entering 'third_party/fbgemm' 2022-11-23T01:13:05.8328518Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:05.8370696Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:05.8412988Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:05.8455228Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:05.8499707Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:05.8544488Z Entering 'third_party/fmt' 2022-11-23T01:13:05.8587684Z Entering 'third_party/foxi' 2022-11-23T01:13:05.8630685Z 
Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:05.8673923Z Entering 'third_party/gloo' 2022-11-23T01:13:05.8718477Z Entering 'third_party/googletest' 2022-11-23T01:13:05.8762373Z Entering 'third_party/ideep' 2022-11-23T01:13:05.8803933Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:05.8848886Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:05.8898002Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:05.8944006Z Entering 'third_party/ittapi' 2022-11-23T01:13:05.8987563Z Entering 'third_party/kineto' 2022-11-23T01:13:05.9030473Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:05.9073365Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:05.9118136Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:05.9161661Z Entering 'third_party/neon2sse' 2022-11-23T01:13:05.9204512Z Entering 'third_party/nlohmann' 2022-11-23T01:13:05.9248994Z Entering 'third_party/onnx' 2022-11-23T01:13:05.9305624Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:05.9348865Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:05.9394514Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:05.9437567Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:05.9486493Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:05.9531105Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:05.9576299Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:05.9625578Z Entering 'third_party/pocketfft' 2022-11-23T01:13:05.9669995Z Entering 'third_party/protobuf' 2022-11-23T01:13:05.9720238Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:05.9764556Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:05.9811800Z Entering 'third_party/psimd' 2022-11-23T01:13:05.9857210Z Entering 
'third_party/pthreadpool' 2022-11-23T01:13:05.9902743Z Entering 'third_party/pybind11' 2022-11-23T01:13:05.9947275Z Entering 'third_party/python-enum' 2022-11-23T01:13:05.9991017Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:06.0037241Z Entering 'third_party/python-six' 2022-11-23T01:13:06.0082551Z Entering 'third_party/sleef' 2022-11-23T01:13:06.0126405Z Entering 'third_party/tbb' 2022-11-23T01:13:06.0172490Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:06.0217256Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:06.0260270Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:06.0302420Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:06.0345076Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:06.0387759Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:06.0434673Z Entering 'third_party/zstd' 2022-11-23T01:13:06.0497390Z ##[endgroup] 2022-11-23T01:13:06.0543591Z [command]/usr/bin/git log -1 --format='%H' 2022-11-23T01:13:06.0573598Z '1cfd3858ac54fe3883534309081631a0a892ba3f' 2022-11-23T01:13:06.0723992Z Prepare all required actions 2022-11-23T01:13:06.0754392Z ##[group]Run ./.github/actions/setup-linux 2022-11-23T01:13:06.0754677Z env: 2022-11-23T01:13:06.0754921Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:06.0755817Z ##[endgroup] 2022-11-23T01:13:06.0773082Z ##[group]Run set -euo pipefail 2022-11-23T01:13:06.0773395Z set -euo pipefail 2022-11-23T01:13:06.0813661Z function get_ec2_metadata() { 2022-11-23T01:13:06.0814068Z  # Pulled from instance metadata endpoint for EC2 2022-11-23T01:13:06.0814547Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2022-11-23T01:13:06.0814952Z  category=$1 2022-11-23T01:13:06.0815284Z  curl -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2022-11-23T01:13:06.0815601Z } 2022-11-23T01:13:06.0815867Z echo "ami-id: $(get_ec2_metadata ami-id)" 
2022-11-23T01:13:06.0816249Z echo "instance-id: $(get_ec2_metadata instance-id)" 2022-11-23T01:13:06.0816625Z echo "instance-type: $(get_ec2_metadata instance-type)" 2022-11-23T01:13:06.0816952Z echo "system info $(uname -a)" 2022-11-23T01:13:06.0830545Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:06.0830827Z env: 2022-11-23T01:13:06.0831044Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:06.0831282Z ##[endgroup] 2022-11-23T01:13:06.0934064Z ami-id: ami-096198a0bccc6bad4 2022-11-23T01:13:06.0997910Z instance-id: i-088dc030290e38a53 2022-11-23T01:13:06.1059899Z instance-type: g3.8xlarge 2022-11-23T01:13:06.1068140Z system info Linux ip-10-0-2-109.ec2.internal 4.14.252-195.483.amzn2.x86_64 #1 SMP Mon Nov 1 20:58:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 2022-11-23T01:13:06.1087511Z ##[group]Run if systemctl is-active --quiet docker; then 2022-11-23T01:13:06.1087880Z if systemctl is-active --quiet docker; then 2022-11-23T01:13:06.1088222Z  echo "Docker daemon is running..."; 2022-11-23T01:13:06.1088509Z else 2022-11-23T01:13:06.1088833Z  echo "Starting docker deamon..." && sudo systemctl start docker; 2022-11-23T01:13:06.1089134Z fi 2022-11-23T01:13:06.1101160Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:06.1101453Z env: 2022-11-23T01:13:06.1101678Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:06.1101939Z ##[endgroup] 2022-11-23T01:13:06.1152682Z Docker daemon is running... 
2022-11-23T01:13:06.1171641Z ##[group]Run AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-11-23T01:13:06.1172099Z AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-11-23T01:13:06.1172490Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:13:06.1173000Z retry aws ecr get-login*** "$AWS_DEFAULT_REGION" | docker login --username AWS \ 2022-11-23T01:13:06.1173476Z  --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" 2022-11-23T01:13:06.1185245Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:06.1185550Z env: 2022-11-23T01:13:06.1185794Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:06.1186069Z AWS_RETRY_MODE: standard 2022-11-23T01:13:06.1186312Z AWS_MAX_ATTEMPTS: 5 2022-11-23T01:13:06.1186585Z AWS_DEFAULT_REGION: us-east-1 2022-11-23T01:13:06.1186843Z ##[endgroup] 2022-11-23T01:13:07.2152923Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2022-11-23T01:13:07.2153405Z Configure a credential helper to remove this warning. 
See 2022-11-23T01:13:07.2153929Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2022-11-23T01:13:07.2154515Z 2022-11-23T01:13:07.2154758Z Login Succeeded 2022-11-23T01:13:07.2196386Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:13:07.2196801Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:13:07.2197287Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:13:07.2210459Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:07.2210742Z env: 2022-11-23T01:13:07.2210987Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:07.2211249Z ##[endgroup] 2022-11-23T01:13:07.2300221Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2022-11-23T01:13:07.2300600Z with: 2022-11-23T01:13:07.2301086Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:13:07.2301538Z env: 2022-11-23T01:13:07.2301783Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:07.2302049Z ##[endgroup] 2022-11-23T01:13:07.2319449Z ##[group]Run retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:13:07.2319826Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:13:07.2320194Z # ignore output since only exit code is used for conditional 2022-11-23T01:13:07.2320584Z # only pull docker image if it's not available locally 2022-11-23T01:13:07.2320997Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2022-11-23T01:13:07.2321405Z  retry docker pull "${DOCKER_IMAGE}" 2022-11-23T01:13:07.2321684Z fi 2022-11-23T01:13:07.2333946Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:07.2334249Z env: 2022-11-23T01:13:07.2334493Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:07.2334991Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:13:07.2335472Z ##[endgroup] 2022-11-23T01:13:07.4651632Z 072aae4a77ed7d3a69ad5683420509c41301b940: Pulling from pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7 2022-11-23T01:13:07.4652121Z a404e5416296: Pulling fs layer 2022-11-23T01:13:07.4652384Z 42d021f37342: Pulling fs layer 2022-11-23T01:13:07.4652660Z 9dab8401a678: Pulling fs layer 2022-11-23T01:13:07.4652940Z 2fc72180b8eb: Pulling fs layer 2022-11-23T01:13:07.4655741Z 16e6a4d496ed: Pulling fs layer 2022-11-23T01:13:07.4656366Z 3a4c0b092a9e: Pulling fs layer 2022-11-23T01:13:07.4656982Z d9e19734e968: Pulling fs layer 2022-11-23T01:13:07.4657381Z 8859cc6b3ab7: Pulling fs layer 2022-11-23T01:13:07.4657808Z 1a4b44db2103: Pulling fs layer 2022-11-23T01:13:07.4658385Z fac00e927cfe: Pulling fs layer 2022-11-23T01:13:07.4658891Z 2133f4081ddb: Pulling fs layer 2022-11-23T01:13:07.4659383Z 3ceac802dd07: Pulling fs layer 2022-11-23T01:13:07.4659720Z 69c929400d35: Pulling fs layer 2022-11-23T01:13:07.4660000Z bc2be817cb7e: Pulling fs layer 2022-11-23T01:13:07.4660252Z fac00e927cfe: Waiting 2022-11-23T01:13:07.4660544Z d04926f1c68d: Pulling fs layer 2022-11-23T01:13:07.4660827Z 91f116f19c0d: Pulling fs layer 2022-11-23T01:13:07.4661087Z a7cf5b3894f8: Pulling fs layer 2022-11-23T01:13:07.4661363Z 722cddd823f9: Pulling fs layer 2022-11-23T01:13:07.4661634Z 2e882087d824: Pulling fs layer 2022-11-23T01:13:07.4661888Z ba6235196410: Pulling fs layer 2022-11-23T01:13:07.4662211Z 
313c5ee380ab: Pulling fs layer 2022-11-23T01:13:07.4662473Z d9e19734e968: Waiting 2022-11-23T01:13:07.4662739Z 6ff0fc00b0a9: Pulling fs layer 2022-11-23T01:13:07.4662991Z 8859cc6b3ab7: Waiting 2022-11-23T01:13:07.4663247Z 3ceac802dd07: Waiting 2022-11-23T01:13:07.4663516Z 06f043acdbfb: Pulling fs layer 2022-11-23T01:13:07.4663777Z a8c562f6a1cf: Pulling fs layer 2022-11-23T01:13:07.4681693Z 69c929400d35: Waiting 2022-11-23T01:13:07.4682342Z 4468cd4f574c: Pulling fs layer 2022-11-23T01:13:07.4682885Z 46e5d3cec398: Pulling fs layer 2022-11-23T01:13:07.4683386Z 165006759af3: Pulling fs layer 2022-11-23T01:13:07.4683868Z ba6235196410: Waiting 2022-11-23T01:13:07.4684686Z 2c660101e145: Pulling fs layer 2022-11-23T01:13:07.4685077Z 2e882087d824: Waiting 2022-11-23T01:13:07.4685322Z 8ce731125a4e: Pulling fs layer 2022-11-23T01:13:07.4685595Z 102ddcd90753: Pulling fs layer 2022-11-23T01:13:07.4686132Z 16e6a4d496ed: Waiting 2022-11-23T01:13:07.4686378Z 3a4c0b092a9e: Waiting 2022-11-23T01:13:07.4686637Z 0cff5716a932: Pulling fs layer 2022-11-23T01:13:07.4686902Z a8c562f6a1cf: Waiting 2022-11-23T01:13:07.4687151Z 867a6ccb577f: Pulling fs layer 2022-11-23T01:13:07.4687519Z 4468cd4f574c: Waiting 2022-11-23T01:13:07.4687788Z 863c35620b44: Pulling fs layer 2022-11-23T01:13:07.4688045Z 2c660101e145: Waiting 2022-11-23T01:13:07.4688308Z f828f00c6d66: Pulling fs layer 2022-11-23T01:13:07.4688577Z f0fe61569b0b: Pulling fs layer 2022-11-23T01:13:07.4688817Z 102ddcd90753: Waiting 2022-11-23T01:13:07.4689075Z a9f3d4742233: Pulling fs layer 2022-11-23T01:13:07.4689327Z 863c35620b44: Waiting 2022-11-23T01:13:07.4689569Z 0cff5716a932: Waiting 2022-11-23T01:13:07.4689826Z 000b6751ea6f: Pulling fs layer 2022-11-23T01:13:07.4690097Z 023a41fa48e6: Pulling fs layer 2022-11-23T01:13:07.4690360Z f0fe61569b0b: Waiting 2022-11-23T01:13:07.4690596Z 081025f05026: Pulling fs layer 2022-11-23T01:13:07.4690864Z 5970defc1d8b: Pulling fs layer 2022-11-23T01:13:07.4691125Z a7cf5b3894f8: Waiting 
2022-11-23T01:13:07.4691354Z a9f3d4742233: Waiting 2022-11-23T01:13:07.4691611Z 7e2d6313145f: Pulling fs layer 2022-11-23T01:13:07.4691869Z 6ff0fc00b0a9: Waiting 2022-11-23T01:13:07.4692116Z 4b4d66451d67: Pulling fs layer 2022-11-23T01:13:07.4692396Z 75f1ead35ace: Pulling fs layer 2022-11-23T01:13:07.4692666Z 793c37004dab: Pulling fs layer 2022-11-23T01:13:07.4692907Z 1a4b44db2103: Waiting 2022-11-23T01:13:07.4693165Z 4f1313d71da0: Pulling fs layer 2022-11-23T01:13:07.4693426Z 4b4d66451d67: Waiting 2022-11-23T01:13:07.4693669Z 6386b2adbe28: Pulling fs layer 2022-11-23T01:13:07.4693926Z 313c5ee380ab: Waiting 2022-11-23T01:13:07.4694167Z 793c37004dab: Waiting 2022-11-23T01:13:07.4697111Z 4f1313d71da0: Waiting 2022-11-23T01:13:07.4697558Z f21c39caec7a: Pulling fs layer 2022-11-23T01:13:07.4698098Z 2baaae32e3d5: Pulling fs layer 2022-11-23T01:13:07.4698380Z 128ebe6811c5: Pulling fs layer 2022-11-23T01:13:07.4698625Z 2baaae32e3d5: Waiting 2022-11-23T01:13:07.4698890Z 772fa4efddc3: Pulling fs layer 2022-11-23T01:13:07.4699170Z c36ac12376c5: Pulling fs layer 2022-11-23T01:13:07.4699410Z 8ce731125a4e: Waiting 2022-11-23T01:13:07.4699678Z 85cc9b957510: Pulling fs layer 2022-11-23T01:13:07.4699952Z 9ab0826d88b5: Pulling fs layer 2022-11-23T01:13:07.4700196Z 6386b2adbe28: Waiting 2022-11-23T01:13:07.4700445Z c36ac12376c5: Waiting 2022-11-23T01:13:07.4700684Z 9ab0826d88b5: Waiting 2022-11-23T01:13:07.4700903Z 165006759af3: Waiting 2022-11-23T01:13:07.4701144Z 772fa4efddc3: Waiting 2022-11-23T01:13:07.4701387Z 91f116f19c0d: Waiting 2022-11-23T01:13:07.4701607Z 85cc9b957510: Waiting 2022-11-23T01:13:07.4701847Z f828f00c6d66: Waiting 2022-11-23T01:13:07.4702086Z 023a41fa48e6: Waiting 2022-11-23T01:13:07.4702310Z 867a6ccb577f: Waiting 2022-11-23T01:13:07.4702558Z 000b6751ea6f: Waiting 2022-11-23T01:13:07.6108760Z 42d021f37342: Verifying Checksum 2022-11-23T01:13:07.6109127Z 42d021f37342: Download complete 2022-11-23T01:13:07.6917460Z 2fc72180b8eb: Verifying Checksum 
2022-11-23T01:13:07.6917787Z 2fc72180b8eb: Download complete 2022-11-23T01:13:07.7669043Z 16e6a4d496ed: Download complete 2022-11-23T01:13:07.7837260Z a404e5416296: Download complete 2022-11-23T01:13:07.8644859Z d9e19734e968: Download complete 2022-11-23T01:13:08.0029520Z 9dab8401a678: Verifying Checksum 2022-11-23T01:13:08.0030059Z 9dab8401a678: Download complete 2022-11-23T01:13:08.0800267Z 1a4b44db2103: Verifying Checksum 2022-11-23T01:13:08.0800676Z 1a4b44db2103: Download complete 2022-11-23T01:13:08.1638744Z fac00e927cfe: Verifying Checksum 2022-11-23T01:13:08.1639357Z fac00e927cfe: Download complete 2022-11-23T01:13:08.5515373Z a404e5416296: Pull complete 2022-11-23T01:13:08.8309336Z 42d021f37342: Pull complete 2022-11-23T01:13:09.6962693Z 9dab8401a678: Pull complete 2022-11-23T01:13:09.8507451Z 2fc72180b8eb: Pull complete 2022-11-23T01:13:09.9873935Z 16e6a4d496ed: Pull complete 2022-11-23T01:13:10.2196569Z 2133f4081ddb: Verifying Checksum 2022-11-23T01:13:10.2197275Z 2133f4081ddb: Download complete 2022-11-23T01:13:10.2923406Z 3ceac802dd07: Verifying Checksum 2022-11-23T01:13:10.2923765Z 3ceac802dd07: Download complete 2022-11-23T01:13:10.3521430Z 69c929400d35: Verifying Checksum 2022-11-23T01:13:10.3521869Z 69c929400d35: Download complete 2022-11-23T01:13:10.4393600Z bc2be817cb7e: Verifying Checksum 2022-11-23T01:13:10.4393971Z bc2be817cb7e: Download complete 2022-11-23T01:13:11.1624931Z d04926f1c68d: Verifying Checksum 2022-11-23T01:13:11.1625296Z d04926f1c68d: Download complete 2022-11-23T01:13:11.2391935Z 91f116f19c0d: Verifying Checksum 2022-11-23T01:13:11.2392539Z 91f116f19c0d: Download complete 2022-11-23T01:13:11.3125162Z a7cf5b3894f8: Verifying Checksum 2022-11-23T01:13:11.3125500Z a7cf5b3894f8: Download complete 2022-11-23T01:13:18.6869043Z 3a4c0b092a9e: Download complete 2022-11-23T01:13:18.7705358Z 2e882087d824: Verifying Checksum 2022-11-23T01:13:18.7705912Z 2e882087d824: Download complete 2022-11-23T01:13:18.8600765Z ba6235196410: Download 
complete 2022-11-23T01:13:18.9361904Z 313c5ee380ab: Download complete 2022-11-23T01:13:19.0162130Z 6ff0fc00b0a9: Verifying Checksum 2022-11-23T01:13:19.0162481Z 6ff0fc00b0a9: Download complete 2022-11-23T01:13:19.1145942Z 06f043acdbfb: Verifying Checksum 2022-11-23T01:13:19.1146607Z 06f043acdbfb: Download complete 2022-11-23T01:13:19.1815679Z a8c562f6a1cf: Verifying Checksum 2022-11-23T01:13:19.1816099Z a8c562f6a1cf: Download complete 2022-11-23T01:13:20.1298339Z 4468cd4f574c: Verifying Checksum 2022-11-23T01:13:20.1298997Z 4468cd4f574c: Download complete 2022-11-23T01:13:20.2021073Z 46e5d3cec398: Verifying Checksum 2022-11-23T01:13:20.2021722Z 46e5d3cec398: Download complete 2022-11-23T01:13:20.2725697Z 165006759af3: Verifying Checksum 2022-11-23T01:13:20.2726308Z 165006759af3: Download complete 2022-11-23T01:13:20.3567576Z 2c660101e145: Verifying Checksum 2022-11-23T01:13:20.3568162Z 2c660101e145: Download complete 2022-11-23T01:13:20.4245334Z 8ce731125a4e: Download complete 2022-11-23T01:13:20.5103670Z 102ddcd90753: Verifying Checksum 2022-11-23T01:13:20.5104311Z 102ddcd90753: Download complete 2022-11-23T01:13:22.0225002Z 8859cc6b3ab7: Verifying Checksum 2022-11-23T01:13:22.0225685Z 8859cc6b3ab7: Download complete 2022-11-23T01:13:22.1096303Z 867a6ccb577f: Download complete 2022-11-23T01:13:22.1860754Z 863c35620b44: Verifying Checksum 2022-11-23T01:13:22.1861108Z 863c35620b44: Download complete 2022-11-23T01:13:22.4975178Z 0cff5716a932: Download complete 2022-11-23T01:13:22.5655965Z f0fe61569b0b: Download complete 2022-11-23T01:13:22.5850974Z f828f00c6d66: Verifying Checksum 2022-11-23T01:13:22.5851594Z f828f00c6d66: Download complete 2022-11-23T01:13:22.6334818Z a9f3d4742233: Verifying Checksum 2022-11-23T01:13:22.6335754Z a9f3d4742233: Download complete 2022-11-23T01:13:22.7124243Z 023a41fa48e6: Verifying Checksum 2022-11-23T01:13:22.7124892Z 023a41fa48e6: Download complete 2022-11-23T01:13:22.8518572Z 000b6751ea6f: Verifying Checksum 
2022-11-23T01:13:22.8518978Z 000b6751ea6f: Download complete 2022-11-23T01:13:22.9297736Z 5970defc1d8b: Verifying Checksum 2022-11-23T01:13:22.9298127Z 5970defc1d8b: Download complete 2022-11-23T01:13:23.0281187Z 7e2d6313145f: Verifying Checksum 2022-11-23T01:13:23.0281562Z 7e2d6313145f: Download complete 2022-11-23T01:13:23.1890145Z 081025f05026: Verifying Checksum 2022-11-23T01:13:23.1890515Z 081025f05026: Download complete 2022-11-23T01:13:23.2826783Z 75f1ead35ace: Verifying Checksum 2022-11-23T01:13:23.2827107Z 75f1ead35ace: Download complete 2022-11-23T01:13:23.3646113Z 793c37004dab: Verifying Checksum 2022-11-23T01:13:23.3646431Z 793c37004dab: Download complete 2022-11-23T01:13:23.4439588Z 4f1313d71da0: Verifying Checksum 2022-11-23T01:13:23.4439970Z 4f1313d71da0: Download complete 2022-11-23T01:13:23.5452778Z 6386b2adbe28: Verifying Checksum 2022-11-23T01:13:23.5453177Z 6386b2adbe28: Download complete 2022-11-23T01:13:23.7310441Z f21c39caec7a: Verifying Checksum 2022-11-23T01:13:23.7310819Z f21c39caec7a: Download complete 2022-11-23T01:13:23.8013689Z 2baaae32e3d5: Verifying Checksum 2022-11-23T01:13:23.8014080Z 2baaae32e3d5: Download complete 2022-11-23T01:13:24.4020017Z 128ebe6811c5: Verifying Checksum 2022-11-23T01:13:24.4020353Z 128ebe6811c5: Download complete 2022-11-23T01:13:24.4768303Z 772fa4efddc3: Verifying Checksum 2022-11-23T01:13:24.4768677Z 772fa4efddc3: Download complete 2022-11-23T01:13:27.8132655Z 4b4d66451d67: Verifying Checksum 2022-11-23T01:13:27.8133027Z 4b4d66451d67: Download complete 2022-11-23T01:13:27.9014267Z 85cc9b957510: Verifying Checksum 2022-11-23T01:13:27.9014621Z 85cc9b957510: Download complete 2022-11-23T01:13:27.9776250Z 9ab0826d88b5: Download complete 2022-11-23T01:13:31.1112465Z 722cddd823f9: Verifying Checksum 2022-11-23T01:13:31.1112819Z 722cddd823f9: Download complete 2022-11-23T01:13:32.1900468Z 3a4c0b092a9e: Pull complete 2022-11-23T01:13:32.3040566Z d9e19734e968: Pull complete 2022-11-23T01:13:47.4690972Z 
c36ac12376c5: Download complete 2022-11-23T01:13:53.9083548Z 8859cc6b3ab7: Pull complete 2022-11-23T01:13:55.7865054Z 1a4b44db2103: Pull complete 2022-11-23T01:13:57.6933949Z fac00e927cfe: Pull complete 2022-11-23T01:14:05.8401012Z 2133f4081ddb: Pull complete 2022-11-23T01:14:07.7185517Z 3ceac802dd07: Pull complete 2022-11-23T01:14:09.8430041Z 69c929400d35: Pull complete 2022-11-23T01:14:12.2637342Z bc2be817cb7e: Pull complete 2022-11-23T01:14:16.5965561Z d04926f1c68d: Pull complete 2022-11-23T01:14:18.9543364Z 91f116f19c0d: Pull complete 2022-11-23T01:14:20.8646970Z a7cf5b3894f8: Pull complete 2022-11-23T01:14:56.6886681Z 722cddd823f9: Pull complete 2022-11-23T01:14:56.8199880Z 2e882087d824: Pull complete 2022-11-23T01:14:56.9479519Z ba6235196410: Pull complete 2022-11-23T01:14:57.0783633Z 313c5ee380ab: Pull complete 2022-11-23T01:14:57.2045726Z 6ff0fc00b0a9: Pull complete 2022-11-23T01:14:57.3485364Z 06f043acdbfb: Pull complete 2022-11-23T01:14:57.4713106Z a8c562f6a1cf: Pull complete 2022-11-23T01:14:59.7875556Z 4468cd4f574c: Pull complete 2022-11-23T01:14:59.8765299Z 46e5d3cec398: Pull complete 2022-11-23T01:14:59.9934110Z 165006759af3: Pull complete 2022-11-23T01:15:00.1349060Z 2c660101e145: Pull complete 2022-11-23T01:15:00.2460172Z 8ce731125a4e: Pull complete 2022-11-23T01:15:00.3608778Z 102ddcd90753: Pull complete 2022-11-23T01:15:08.5210846Z 0cff5716a932: Pull complete 2022-11-23T01:15:10.3941136Z 867a6ccb577f: Pull complete 2022-11-23T01:15:12.2665707Z 863c35620b44: Pull complete 2022-11-23T01:15:15.1145030Z f828f00c6d66: Pull complete 2022-11-23T01:15:16.9665402Z f0fe61569b0b: Pull complete 2022-11-23T01:15:19.9735381Z a9f3d4742233: Pull complete 2022-11-23T01:15:23.6852893Z 000b6751ea6f: Pull complete 2022-11-23T01:15:27.2291460Z 023a41fa48e6: Pull complete 2022-11-23T01:15:31.9730089Z 081025f05026: Pull complete 2022-11-23T01:15:34.0479181Z 5970defc1d8b: Pull complete 2022-11-23T01:15:36.4349606Z 7e2d6313145f: Pull complete 2022-11-23T01:15:44.9763967Z 
4b4d66451d67: Pull complete 2022-11-23T01:15:46.8456617Z 75f1ead35ace: Pull complete 2022-11-23T01:15:48.7254340Z 793c37004dab: Pull complete 2022-11-23T01:15:50.6340616Z 4f1313d71da0: Pull complete 2022-11-23T01:15:52.3809870Z 6386b2adbe28: Pull complete 2022-11-23T01:15:53.1788194Z f21c39caec7a: Pull complete 2022-11-23T01:15:53.2805130Z 2baaae32e3d5: Pull complete 2022-11-23T01:15:55.2078148Z 128ebe6811c5: Pull complete 2022-11-23T01:15:55.3152344Z 772fa4efddc3: Pull complete 2022-11-23T01:16:23.7912807Z c36ac12376c5: Pull complete 2022-11-23T01:16:23.8963370Z 85cc9b957510: Pull complete 2022-11-23T01:16:24.3995678Z 9ab0826d88b5: Pull complete 2022-11-23T01:16:24.4111986Z Digest: sha256:a44ece0129de4f14f08fcb1423a34c97f2f88e2969bfc5ad6f33b15b4dfcfea3 2022-11-23T01:16:24.4143208Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:16:24.4173778Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:16:24.4276528Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2022-11-23T01:16:24.4276869Z with: 2022-11-23T01:16:24.4277118Z driver-version: 515.76 2022-11-23T01:16:24.4277361Z env: 2022-11-23T01:16:24.4277583Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:16:24.4277844Z ##[endgroup] 2022-11-23T01:16:24.4311449Z ##[group]Run nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767 2022-11-23T01:16:24.4311755Z with: 2022-11-23T01:16:24.4311995Z timeout_minutes: 10 2022-11-23T01:16:24.4312246Z max_attempts: 3 2022-11-23T01:16:24.4318820Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. 
/etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y nvidia-docker2 sudo systemctl restart docker ) } install_nvidia_driver_amzn2() { ( set -x # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. 
Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi NVIDIA_SMI_STATUS=$? 
# Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac echo "GPU_FLAG=--gpus all" >> "${GITHUB_ENV}" 2022-11-23T01:16:24.4324935Z retry_wait_seconds: 10 2022-11-23T01:16:24.4325248Z polling_interval_seconds: 1 2022-11-23T01:16:24.4325538Z warning_on_retry: true 2022-11-23T01:16:24.4325812Z continue_on_error: false 2022-11-23T01:16:24.4326045Z env: 2022-11-23T01:16:24.4326303Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:16:24.4326582Z DRIVER_VERSION: 515.76 2022-11-23T01:16:24.4326821Z ##[endgroup] 2022-11-23T01:16:24.4889281Z 2022-11-23T01:16:24.4911307Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:16:24.4953585Z == Installing nvidia driver NVIDIA-Linux-x86_64-515.76.run == 2022-11-23T01:16:24.4955716Z + sudo yum remove -y nvidia-driver-latest-dkms 2022-11-23T01:16:25.0484166Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:16:25.1086886Z No Match for argument: nvidia-driver-latest-dkms 2022-11-23T01:16:25.1469529Z No Packages marked for removal 2022-11-23T01:16:25.1634858Z + echo 'Before installing NVIDIA driver' 2022-11-23T01:16:25.1635875Z + lspci 2022-11-23T01:16:25.1636327Z Before installing NVIDIA driver 2022-11-23T01:16:25.1863324Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) 2022-11-23T01:16:25.1864084Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2022-11-23T01:16:25.1864672Z 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 2022-11-23T01:16:25.1865295Z 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01) 2022-11-23T01:16:25.1866152Z 00:02.0 VGA compatible controller: Cirrus Logic GD 5446 2022-11-23T01:16:25.1866886Z 00:03.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2022-11-23T01:16:25.1870925Z 00:1d.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:16:25.1871678Z 00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:16:25.1872364Z 00:1f.0 Unassigned class [ff80]: XenSource, Inc. 
Xen Platform Device (rev 01) 2022-11-23T01:16:25.1872849Z + lsmod 2022-11-23T01:16:25.1886600Z Module Size Used by 2022-11-23T01:16:25.1887109Z nvidia_modeset 1142784 0 2022-11-23T01:16:25.1887556Z nvidia_uvm 1269760 0 2022-11-23T01:16:25.1887966Z veth 16384 0 2022-11-23T01:16:25.1888437Z nvidia 40808448 15 nvidia_uvm,nvidia_modeset 2022-11-23T01:16:25.1888911Z drm 425984 1 nvidia 2022-11-23T01:16:25.1889341Z i2c_core 77824 2 nvidia,drm 2022-11-23T01:16:25.1889820Z backlight 16384 1 nvidia_modeset 2022-11-23T01:16:25.1890294Z xt_conntrack 16384 1 2022-11-23T01:16:25.1890758Z ipt_MASQUERADE 16384 1 2022-11-23T01:16:25.1891192Z nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE 2022-11-23T01:16:25.1891708Z nf_conntrack_netlink 49152 0 2022-11-23T01:16:25.1892212Z nfnetlink 16384 2 nf_conntrack_netlink 2022-11-23T01:16:25.1893002Z xfrm_user 45056 1 2022-11-23T01:16:25.1893486Z xfrm_algo 16384 1 xfrm_user 2022-11-23T01:16:25.1893939Z xt_addrtype 16384 2 2022-11-23T01:16:25.1894380Z iptable_filter 16384 1 2022-11-23T01:16:25.1894785Z iptable_nat 16384 1 2022-11-23T01:16:25.1895231Z nf_conntrack_ipv4 16384 3 2022-11-23T01:16:25.1895732Z nf_defrag_ipv4 16384 1 nf_conntrack_ipv4 2022-11-23T01:16:25.1896245Z nf_nat_ipv4 16384 1 iptable_nat 2022-11-23T01:16:25.1896875Z nf_nat 36864 2 nf_nat_masquerade_ipv4,nf_nat_ipv4 2022-11-23T01:16:25.1897594Z nf_conntrack 155648 7 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink 2022-11-23T01:16:25.1898242Z br_netfilter 24576 0 2022-11-23T01:16:25.1898677Z bridge 172032 1 br_netfilter 2022-11-23T01:16:25.1899125Z stp 16384 1 bridge 2022-11-23T01:16:25.1899573Z llc 16384 2 bridge,stp 2022-11-23T01:16:25.1900023Z overlay 86016 0 2022-11-23T01:16:25.1900439Z sunrpc 393216 1 2022-11-23T01:16:25.1900856Z dm_mirror 28672 0 2022-11-23T01:16:25.1901308Z dm_region_hash 20480 1 dm_mirror 2022-11-23T01:16:25.1901811Z dm_log 20480 2 dm_region_hash,dm_mirror 2022-11-23T01:16:25.1902297Z dm_mod 
143360 2 dm_log,dm_mirror 2022-11-23T01:16:25.1902724Z dax 69632 1 dm_mod 2022-11-23T01:16:25.1903145Z sb_edac 24576 0 2022-11-23T01:16:25.1903560Z crc32_pclmul 16384 0 2022-11-23T01:16:25.1904012Z ghash_clmulni_intel 16384 0 2022-11-23T01:16:25.1904422Z pcbc 16384 0 2022-11-23T01:16:25.1904865Z aesni_intel 188416 0 2022-11-23T01:16:25.1905290Z ata_piix 36864 0 2022-11-23T01:16:25.1905717Z aes_x86_64 20480 1 aesni_intel 2022-11-23T01:16:25.1906163Z libata 266240 1 ata_piix 2022-11-23T01:16:25.1906665Z crypto_simd 16384 1 aesni_intel 2022-11-23T01:16:25.1907105Z glue_helper 16384 1 aesni_intel 2022-11-23T01:16:25.1907526Z pcc_cpufreq 16384 0 2022-11-23T01:16:25.1908078Z cryptd 28672 3 crypto_simd,ghash_clmulni_intel,aesni_intel 2022-11-23T01:16:25.1908610Z mousedev 24576 0 2022-11-23T01:16:25.1909040Z scsi_mod 245760 1 libata 2022-11-23T01:16:25.1909520Z evdev 20480 3 2022-11-23T01:16:25.1909931Z psmouse 32768 0 2022-11-23T01:16:25.1910331Z button 16384 0 2022-11-23T01:16:25.1910740Z ena 114688 0 2022-11-23T01:16:25.1911257Z xen_blkfront 49152 2 2022-11-23T01:16:25.1911897Z crc32c_intel 24576 0 2022-11-23T01:16:25.1912444Z autofs4 49152 2 2022-11-23T01:16:25.1912856Z + modinfo nvidia 2022-11-23T01:16:25.1913600Z filename: /lib/modules/4.14.252-195.483.amzn2.x86_64/kernel/drivers/video/nvidia.ko 2022-11-23T01:16:25.1914149Z firmware: nvidia/515.76/gsp.bin 2022-11-23T01:16:25.1914689Z alias: char-major-195-* 2022-11-23T01:16:25.1915697Z version: 515.76 2022-11-23T01:16:25.1916112Z supported: external 2022-11-23T01:16:25.1916497Z license: NVIDIA 2022-11-23T01:16:25.1916963Z srcversion: 51FD9DD90150B35351AFFBB 2022-11-23T01:16:25.1917422Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2022-11-23T01:16:25.1917927Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2022-11-23T01:16:25.1918436Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2022-11-23T01:16:25.1918982Z depends: i2c-core,drm 2022-11-23T01:16:25.1919420Z retpoline: Y 2022-11-23T01:16:25.1919858Z name: nvidia 
2022-11-23T01:16:25.1920540Z vermagic: 4.14.252-195.483.amzn2.x86_64 SMP mod_unload modversions 2022-11-23T01:16:25.1921138Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2022-11-23T01:16:25.1921751Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2022-11-23T01:16:25.1922328Z parm: NVreg_ResmanDebugLevel:int 2022-11-23T01:16:25.1922819Z parm: NVreg_RmLogonRC:int 2022-11-23T01:16:25.1923282Z parm: NVreg_ModifyDeviceFiles:int 2022-11-23T01:16:25.1923779Z parm: NVreg_DeviceFileUID:int 2022-11-23T01:16:25.1924259Z parm: NVreg_DeviceFileGID:int 2022-11-23T01:16:25.1924743Z parm: NVreg_DeviceFileMode:int 2022-11-23T01:16:25.1925304Z parm: NVreg_InitializeSystemMemoryAllocations:int 2022-11-23T01:16:25.1925883Z parm: NVreg_UsePageAttributeTable:int 2022-11-23T01:16:25.1926376Z parm: NVreg_EnablePCIeGen3:int 2022-11-23T01:16:25.1926862Z parm: NVreg_EnableMSI:int 2022-11-23T01:16:25.1927339Z parm: NVreg_TCEBypassMode:int 2022-11-23T01:16:25.1927863Z parm: NVreg_EnableStreamMemOPs:int 2022-11-23T01:16:25.1928426Z parm: NVreg_RestrictProfilingToAdminUsers:int 2022-11-23T01:16:25.1929033Z parm: NVreg_PreserveVideoMemoryAllocations:int 2022-11-23T01:16:25.1929632Z parm: NVreg_EnableS0ixPowerManagement:int 2022-11-23T01:16:25.1930253Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2022-11-23T01:16:25.1930861Z parm: NVreg_DynamicPowerManagement:int 2022-11-23T01:16:25.1931506Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2022-11-23T01:16:25.1932137Z parm: NVreg_EnableGpuFirmware:int 2022-11-23T01:16:25.1932652Z parm: NVreg_EnableGpuFirmwareLogs:int 2022-11-23T01:16:25.1933229Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2022-11-23T01:16:25.1933827Z parm: NVreg_EnableUserNUMAManagement:int 2022-11-23T01:16:25.1934367Z parm: NVreg_MemoryPoolSize:int 2022-11-23T01:16:25.1934866Z parm: NVreg_KMallocHeapMaxSize:int 2022-11-23T01:16:25.1935411Z parm: NVreg_VMallocHeapMaxSize:int 2022-11-23T01:16:25.1935891Z parm: NVreg_IgnoreMMIOCheck:int 
2022-11-23T01:16:25.1936374Z parm: NVreg_NvLinkDisable:int 2022-11-23T01:16:25.1936920Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2022-11-23T01:16:25.1937493Z parm: NVreg_RegisterPCIDriver:int 2022-11-23T01:16:25.1938004Z parm: NVreg_EnableDbgBreakpoint:int 2022-11-23T01:16:25.1938530Z parm: NVreg_RegistryDwords:charp 2022-11-23T01:16:25.1939070Z parm: NVreg_RegistryDwordsPerDevice:charp 2022-11-23T01:16:25.1939550Z parm: NVreg_RmMsg:charp 2022-11-23T01:16:25.1940033Z parm: NVreg_GpuBlacklist:charp 2022-11-23T01:16:25.1940551Z parm: NVreg_TemporaryFilePath:charp 2022-11-23T01:16:25.1941029Z parm: NVreg_ExcludedGpus:charp 2022-11-23T01:16:25.1941718Z parm: NVreg_DmaRemapPeerMmio:int 2022-11-23T01:16:25.1942198Z parm: rm_firmware_active:charp 2022-11-23T01:16:25.1942788Z + HAS_NVIDIA_DRIVER=0 2022-11-23T01:16:25.1943266Z ++ command -v nvidia-smi 2022-11-23T01:16:25.1943736Z + '[' -x /usr/bin/nvidia-smi ']' 2022-11-23T01:16:25.1944095Z + set +e 2022-11-23T01:16:25.1944642Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2022-11-23T01:16:27.6264711Z + INSTALLED_DRIVER_VERSION=515.76 2022-11-23T01:16:27.6265054Z + NVIDIA_SMI_STATUS=0 2022-11-23T01:16:27.6265490Z + '[' 0 -ne 0 ']' 2022-11-23T01:16:27.6265785Z + '[' 515.76 '!=' 515.76 ']' 2022-11-23T01:16:27.6266037Z + HAS_NVIDIA_DRIVER=1 2022-11-23T01:16:27.6266501Z + echo 'NVIDIA driver (515.76) has already been installed. Skipping NVIDIA driver installation' 2022-11-23T01:16:27.6266868Z + set -e 2022-11-23T01:16:27.6267140Z + '[' 1 -eq 0 ']' 2022-11-23T01:16:27.6267375Z + sudo modprobe nvidia 2022-11-23T01:16:27.6267733Z NVIDIA driver (515.76) has already been installed. 
Skipping NVIDIA driver installation 2022-11-23T01:16:27.6415256Z + echo 'After installing NVIDIA driver' 2022-11-23T01:16:27.6415557Z + lspci 2022-11-23T01:16:27.6415794Z After installing NVIDIA driver 2022-11-23T01:16:27.6627430Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) 2022-11-23T01:16:27.6627883Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2022-11-23T01:16:27.6628286Z 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 2022-11-23T01:16:27.6628684Z 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01) 2022-11-23T01:16:27.6629056Z 00:02.0 VGA compatible controller: Cirrus Logic GD 5446 2022-11-23T01:16:27.6629429Z 00:03.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2022-11-23T01:16:27.6629862Z 00:1d.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:16:27.6630294Z 00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:16:27.6630730Z 00:1f.0 Unassigned class [ff80]: XenSource, Inc. 
Xen Platform Device (rev 01) 2022-11-23T01:16:27.6631038Z + lsmod 2022-11-23T01:16:27.6649592Z Module Size Used by 2022-11-23T01:16:27.6649884Z nvidia_modeset 1142784 0 2022-11-23T01:16:27.6650206Z nvidia_uvm 1269760 0 2022-11-23T01:16:27.6650677Z veth 16384 0 2022-11-23T01:16:27.6651081Z nvidia 40808448 27 nvidia_uvm,nvidia_modeset 2022-11-23T01:16:27.6651448Z drm 425984 1 nvidia 2022-11-23T01:16:27.6651775Z i2c_core 77824 2 nvidia,drm 2022-11-23T01:16:27.6652098Z backlight 16384 1 nvidia_modeset 2022-11-23T01:16:27.6652490Z xt_conntrack 16384 1 2022-11-23T01:16:27.6652826Z ipt_MASQUERADE 16384 1 2022-11-23T01:16:27.6653137Z nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE 2022-11-23T01:16:27.6653507Z nf_conntrack_netlink 49152 0 2022-11-23T01:16:27.6653920Z nfnetlink 16384 2 nf_conntrack_netlink 2022-11-23T01:16:27.6654230Z xfrm_user 45056 1 2022-11-23T01:16:27.6654616Z xfrm_algo 16384 1 xfrm_user 2022-11-23T01:16:27.6654959Z xt_addrtype 16384 2 2022-11-23T01:16:27.6655237Z iptable_filter 16384 1 2022-11-23T01:16:27.6655593Z iptable_nat 16384 1 2022-11-23T01:16:27.6655913Z nf_conntrack_ipv4 16384 3 2022-11-23T01:16:27.6656235Z nf_defrag_ipv4 16384 1 nf_conntrack_ipv4 2022-11-23T01:16:27.6656597Z nf_nat_ipv4 16384 1 iptable_nat 2022-11-23T01:16:27.6657006Z nf_nat 36864 2 nf_nat_masquerade_ipv4,nf_nat_ipv4 2022-11-23T01:16:27.6657547Z nf_conntrack 155648 7 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink 2022-11-23T01:16:27.6657978Z br_netfilter 24576 0 2022-11-23T01:16:27.6658307Z bridge 172032 1 br_netfilter 2022-11-23T01:16:27.6658635Z stp 16384 1 bridge 2022-11-23T01:16:27.6659147Z llc 16384 2 bridge,stp 2022-11-23T01:16:27.6659586Z overlay 86016 0 2022-11-23T01:16:27.6659915Z sunrpc 393216 1 2022-11-23T01:16:27.6660231Z dm_mirror 28672 0 2022-11-23T01:16:27.6660521Z dm_region_hash 20480 1 dm_mirror 2022-11-23T01:16:27.6660961Z dm_log 20480 2 dm_region_hash,dm_mirror 2022-11-23T01:16:27.6661329Z dm_mod 
143360 2 dm_log,dm_mirror 2022-11-23T01:16:27.6661619Z dax 69632 1 dm_mod 2022-11-23T01:16:27.6661931Z sb_edac 24576 0 2022-11-23T01:16:27.6662297Z crc32_pclmul 16384 0 2022-11-23T01:16:27.6662580Z ghash_clmulni_intel 16384 0 2022-11-23T01:16:27.6662897Z pcbc 16384 0 2022-11-23T01:16:27.6663210Z aesni_intel 188416 0 2022-11-23T01:16:27.6663514Z ata_piix 36864 0 2022-11-23T01:16:27.6663887Z aes_x86_64 20480 1 aesni_intel 2022-11-23T01:16:27.6664223Z libata 266240 1 ata_piix 2022-11-23T01:16:27.6664525Z crypto_simd 16384 1 aesni_intel 2022-11-23T01:16:27.6664892Z glue_helper 16384 1 aesni_intel 2022-11-23T01:16:27.6665253Z pcc_cpufreq 16384 0 2022-11-23T01:16:27.6665605Z cryptd 28672 3 crypto_simd,ghash_clmulni_intel,aesni_intel 2022-11-23T01:16:27.6665986Z mousedev 24576 0 2022-11-23T01:16:27.6666311Z scsi_mod 245760 1 libata 2022-11-23T01:16:27.6666586Z evdev 20480 3 2022-11-23T01:16:27.6666919Z psmouse 32768 0 2022-11-23T01:16:27.6667237Z button 16384 0 2022-11-23T01:16:27.6667547Z ena 114688 0 2022-11-23T01:16:27.6667811Z xen_blkfront 49152 2 2022-11-23T01:16:27.6668121Z crc32c_intel 24576 0 2022-11-23T01:16:27.6668459Z autofs4 49152 2 2022-11-23T01:16:27.6668718Z + modinfo nvidia 2022-11-23T01:16:27.6669262Z filename: /lib/modules/4.14.252-195.483.amzn2.x86_64/kernel/drivers/video/nvidia.ko 2022-11-23T01:16:27.6669709Z firmware: nvidia/515.76/gsp.bin 2022-11-23T01:16:27.6670067Z alias: char-major-195-* 2022-11-23T01:16:27.6670455Z version: 515.76 2022-11-23T01:16:27.6670775Z supported: external 2022-11-23T01:16:27.6671045Z license: NVIDIA 2022-11-23T01:16:27.6671372Z srcversion: 51FD9DD90150B35351AFFBB 2022-11-23T01:16:27.6671770Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2022-11-23T01:16:27.6672151Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2022-11-23T01:16:27.6672472Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2022-11-23T01:16:27.6700736Z depends: i2c-core,drm 2022-11-23T01:16:27.6701010Z retpoline: Y 2022-11-23T01:16:27.6701264Z name: nvidia 
2022-11-23T01:16:27.6701679Z vermagic: 4.14.252-195.483.amzn2.x86_64 SMP mod_unload modversions 2022-11-23T01:16:27.6702051Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2022-11-23T01:16:27.6702467Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2022-11-23T01:16:27.6702841Z parm: NVreg_ResmanDebugLevel:int 2022-11-23T01:16:27.6703126Z parm: NVreg_RmLogonRC:int 2022-11-23T01:16:27.6703435Z parm: NVreg_ModifyDeviceFiles:int 2022-11-23T01:16:27.6703742Z parm: NVreg_DeviceFileUID:int 2022-11-23T01:16:27.6704046Z parm: NVreg_DeviceFileGID:int 2022-11-23T01:16:27.6704330Z parm: NVreg_DeviceFileMode:int 2022-11-23T01:16:27.6704693Z parm: NVreg_InitializeSystemMemoryAllocations:int 2022-11-23T01:16:27.6705072Z parm: NVreg_UsePageAttributeTable:int 2022-11-23T01:16:27.6705382Z parm: NVreg_EnablePCIeGen3:int 2022-11-23T01:16:27.6705695Z parm: NVreg_EnableMSI:int 2022-11-23T01:16:27.6705987Z parm: NVreg_TCEBypassMode:int 2022-11-23T01:16:27.6706289Z parm: NVreg_EnableStreamMemOPs:int 2022-11-23T01:16:27.6706647Z parm: NVreg_RestrictProfilingToAdminUsers:int 2022-11-23T01:16:27.6707205Z parm: NVreg_PreserveVideoMemoryAllocations:int 2022-11-23T01:16:27.6707633Z parm: NVreg_EnableS0ixPowerManagement:int 2022-11-23T01:16:27.6708055Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2022-11-23T01:16:27.6708448Z parm: NVreg_DynamicPowerManagement:int 2022-11-23T01:16:27.6708863Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2022-11-23T01:16:27.6709243Z parm: NVreg_EnableGpuFirmware:int 2022-11-23T01:16:27.6709574Z parm: NVreg_EnableGpuFirmwareLogs:int 2022-11-23T01:16:27.6709926Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2022-11-23T01:16:27.6710274Z parm: NVreg_EnableUserNUMAManagement:int 2022-11-23T01:16:27.6710597Z parm: NVreg_MemoryPoolSize:int 2022-11-23T01:16:27.6710910Z parm: NVreg_KMallocHeapMaxSize:int 2022-11-23T01:16:27.6711223Z parm: NVreg_VMallocHeapMaxSize:int 2022-11-23T01:16:27.6711537Z parm: NVreg_IgnoreMMIOCheck:int 
2022-11-23T01:16:27.6711843Z parm: NVreg_NvLinkDisable:int 2022-11-23T01:16:27.6712179Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2022-11-23T01:16:27.6712532Z parm: NVreg_RegisterPCIDriver:int 2022-11-23T01:16:27.6712859Z parm: NVreg_EnableDbgBreakpoint:int 2022-11-23T01:16:27.6713167Z parm: NVreg_RegistryDwords:charp 2022-11-23T01:16:27.6713510Z parm: NVreg_RegistryDwordsPerDevice:charp 2022-11-23T01:16:27.6713829Z parm: NVreg_RmMsg:charp 2022-11-23T01:16:27.6714121Z parm: NVreg_GpuBlacklist:charp 2022-11-23T01:16:27.6714426Z parm: NVreg_TemporaryFilePath:charp 2022-11-23T01:16:27.6714743Z parm: NVreg_ExcludedGpus:charp 2022-11-23T01:16:27.6715410Z parm: NVreg_DmaRemapPeerMmio:int 2022-11-23T01:16:27.6715809Z parm: rm_firmware_active:charp 2022-11-23T01:16:27.6716076Z + set +e 2022-11-23T01:16:27.6716356Z + nvidia-smi 2022-11-23T01:16:27.6872355Z Wed Nov 23 01:16:27 2022 2022-11-23T01:16:27.6872828Z +-----------------------------------------------------------------------------+ 2022-11-23T01:16:27.6873346Z | NVIDIA-SMI 515.76 Driver Version: 515.76 CUDA Version: 11.7 | 2022-11-23T01:16:27.6873816Z |-------------------------------+----------------------+----------------------+ 2022-11-23T01:16:27.6874335Z | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | 2022-11-23T01:16:27.6874829Z | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | 2022-11-23T01:16:27.6875575Z | | | MIG M. 
| 2022-11-23T01:16:27.6875869Z |===============================+======================+======================| 2022-11-23T01:16:27.6929526Z | 0 Tesla M60 Off | 00000000:00:1D.0 Off | 10560238342 | 2022-11-23T01:16:27.6929889Z | N/A 32C P0 38W / 150W | 0MiB / 7680MiB | 0% Default | 2022-11-23T01:16:27.6930210Z | | | N/A | 2022-11-23T01:16:27.6930674Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:16:27.7008458Z | 1 Tesla M60 Off | 00000000:00:1E.0 Off | 8589934590 | 2022-11-23T01:16:27.7008804Z | N/A 39C P0 38W / 150W | 0MiB / 7680MiB | 64% Default | 2022-11-23T01:16:27.7009134Z | | | N/A | 2022-11-23T01:16:27.7009576Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:16:27.7009946Z 2022-11-23T01:16:27.7010378Z +-----------------------------------------------------------------------------+ 2022-11-23T01:16:27.7010842Z | Processes: | 2022-11-23T01:16:27.7011195Z | GPU GI CI PID Type Process name GPU Memory | 2022-11-23T01:16:27.7011758Z | ID ID Usage | 2022-11-23T01:16:27.7012080Z |=============================================================================| 2022-11-23T01:16:27.7017239Z | No running processes found | 2022-11-23T01:16:27.7018195Z +-----------------------------------------------------------------------------+ 2022-11-23T01:16:27.7565743Z + NVIDIA_SMI_STATUS=0 2022-11-23T01:16:27.7566247Z + '[' 0 -eq 0 ']' 2022-11-23T01:16:27.7566955Z + echo 'INFO: Ignoring allowed status 0' 2022-11-23T01:16:27.7567591Z + set -e 2022-11-23T01:16:27.7568008Z INFO: Ignoring allowed status 0 2022-11-23T01:16:27.7572154Z == Installing nvidia container toolkit for amzn2 == 2022-11-23T01:16:27.7576694Z + sudo yum install -y yum-utils 2022-11-23T01:16:28.3079229Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:16:30.0293149Z Package yum-utils-1.1.31-46.amzn2.0.1.noarch already installed and latest version 2022-11-23T01:16:30.0293654Z Nothing to do 
2022-11-23T01:16:30.1142627Z + sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-11-23T01:16:30.6904175Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:16:30.7209853Z adding repo from: https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-11-23T01:16:30.7210924Z grabbing file https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo to /etc/yum.repos.d/nvidia-docker.repo 2022-11-23T01:16:30.7211470Z repo saved to /etc/yum.repos.d/nvidia-docker.repo 2022-11-23T01:16:30.7362852Z + sudo yum install -y nvidia-docker2 2022-11-23T01:16:31.2832156Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:16:32.6210291Z Package nvidia-docker2-2.11.0-1.noarch already installed and latest version 2022-11-23T01:16:32.6210874Z Nothing to do 2022-11-23T01:16:32.7024553Z + sudo systemctl restart docker 2022-11-23T01:16:58.5296139Z Command completed after 1 attempt(s). 2022-11-23T01:16:58.5296652Z 2022-11-23T01:16:58.5301702Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:16:58.5343727Z ##[group]Run python3 -m pip install psutil==5.9.1 2022-11-23T01:16:58.5344147Z python3 -m pip install psutil==5.9.1 2022-11-23T01:16:58.5344467Z python3 -m pip install pynvml==11.4.1 2022-11-23T01:16:58.5344832Z python3 -m tools.stats.monitor > usage_log.txt 2>&1 & 2022-11-23T01:16:58.5345223Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2022-11-23T01:16:58.5359090Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:16:58.5359403Z env: 2022-11-23T01:16:58.5359654Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:16:58.5359926Z GPU_FLAG: --gpus all 2022-11-23T01:16:58.5360184Z ##[endgroup] 2022-11-23T01:16:58.8347776Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T01:16:58.8578940Z Requirement already satisfied: psutil==5.9.1 in /home/ec2-user/.local/lib/python3.7/site-packages (5.9.1) 2022-11-23T01:16:59.4456912Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T01:16:59.4695167Z Requirement already satisfied: pynvml==11.4.1 in /home/ec2-user/.local/lib/python3.7/site-packages (11.4.1) 2022-11-23T01:16:59.7700682Z Prepare all required actions 2022-11-23T01:16:59.7701068Z Getting action download info 2022-11-23T01:16:59.9257209Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:4a8bfae15cc25cc0785c1603ee87a9da8fd442ea) 2022-11-23T01:17:00.1814216Z Download action repository 'actions/download-artifact@v3' (SHA:9782bd6a9848b53b110e712e20e42d89988822b7) 2022-11-23T01:17:00.5192811Z ##[group]Run ./.github/actions/download-build-artifacts 2022-11-23T01:17:00.5193258Z with: 2022-11-23T01:17:00.5193526Z name: linux-bionic-cuda11.7-py3.10-gcc7 2022-11-23T01:17:00.5193818Z env: 2022-11-23T01:17:00.5194058Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:17:00.5194329Z GPU_FLAG: --gpus all 
2022-11-23T01:17:00.5194564Z ##[endgroup] 2022-11-23T01:17:00.5225343Z ##[group]Run seemethere/download-artifact-s3@v4 2022-11-23T01:17:00.5225622Z with: 2022-11-23T01:17:00.5225907Z name: linux-bionic-cuda11.7-py3.10-gcc7 2022-11-23T01:17:00.5226221Z s3-bucket: gha-artifacts 2022-11-23T01:17:00.5226541Z region: us-east-1 2022-11-23T01:17:00.5226767Z env: 2022-11-23T01:17:00.5227010Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:17:00.5227279Z GPU_FLAG: --gpus all 2022-11-23T01:17:00.5227511Z ##[endgroup] 2022-11-23T01:17:01.0478442Z Found 1 objects with prefix pytorch/pytorch/3528293554/linux-bionic-cuda11.7-py3.10-gcc7/ 2022-11-23T01:17:01.0479050Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-11-23T01:17:08.1603124Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-11-23T01:17:08.1603753Z 2022-11-23T01:17:08.1608763Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:17:08.1610065Z Artifact download has finished successfully 2022-11-23T01:17:08.1922624Z ##[group]Run unzip -o artifacts.zip 2022-11-23T01:17:08.1922957Z unzip -o artifacts.zip 2022-11-23T01:17:08.1936180Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:17:08.1936464Z env: 2022-11-23T01:17:08.1936712Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:17:08.1936986Z GPU_FLAG: --gpus all 2022-11-23T01:17:08.1937222Z ##[endgroup] 2022-11-23T01:17:08.1980914Z Archive: artifacts.zip 2022-11-23T01:17:08.1982890Z creating: dist/ 2022-11-23T01:17:10.2670412Z inflating: dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:17:10.2670897Z creating: build/custom_test_artifacts/ 2022-11-23T01:17:10.2671332Z creating: build/custom_test_artifacts/custom-op-build/ 2022-11-23T01:17:10.2671792Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2022-11-23T01:17:10.2678493Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:17:10.2679055Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/ 2022-11-23T01:17:10.2679630Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:17:10.2680188Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:17:10.2680752Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:17:10.2683014Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:17:10.2684149Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:17:10.2684706Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:17:10.2685284Z 
creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:17:10.2688089Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:17:10.2689330Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:17:10.2690759Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:17:10.2691808Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:17:10.2693771Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:17:10.2694951Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:17:10.2695586Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:17:10.2696164Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:17:10.2750094Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:17:10.2750835Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:17:10.2751570Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:17:10.2752305Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:17:10.2753050Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:17:10.2753771Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-11-23T01:17:10.2754474Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:17:10.2755484Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:17:10.2756198Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:17:10.2798477Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:17:10.2839833Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:17:10.2840855Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:17:10.2841557Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:17:10.2842408Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:17:10.2843316Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:17:10.2844104Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:17:10.2844965Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:17:10.2847023Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:17:10.2920682Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:17:10.2993821Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:17:10.2994467Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:17:10.2995277Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:17:10.2996194Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeError.log 2022-11-23T01:17:10.2996793Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2022-11-23T01:17:10.2997343Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2022-11-23T01:17:10.2997930Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2022-11-23T01:17:10.2998655Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2022-11-23T01:17:10.2999257Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2022-11-23T01:17:10.2999840Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2022-11-23T01:17:10.3000442Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2022-11-23T01:17:10.3001451Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2022-11-23T01:17:10.3002058Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2022-11-23T01:17:10.3002869Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2022-11-23T01:17:10.3003659Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2022-11-23T01:17:10.3024610Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2022-11-23T01:17:10.3139001Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2022-11-23T01:17:10.3139579Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2022-11-23T01:17:10.3140190Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2022-11-23T01:17:10.3140827Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2022-11-23T01:17:10.3141451Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2022-11-23T01:17:10.3142056Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2022-11-23T01:17:10.3142666Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2022-11-23T01:17:10.3143521Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2022-11-23T01:17:10.3144333Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2022-11-23T01:17:10.3144963Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2022-11-23T01:17:10.3145581Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2022-11-23T01:17:10.3166942Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2022-11-23T01:17:10.3250528Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2022-11-23T01:17:10.3251173Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:17:10.3251788Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:17:10.3252354Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 
2022-11-23T01:17:10.3253059Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2022-11-23T01:17:10.3254284Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2022-11-23T01:17:10.3254828Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2022-11-23T01:17:10.3257991Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2022-11-23T01:17:10.3258682Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2022-11-23T01:17:10.3259287Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2022-11-23T01:17:10.3352349Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2022-11-23T01:17:10.3416065Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2022-11-23T01:17:10.3416676Z creating: build/custom_test_artifacts/jit-hook-build/ 2022-11-23T01:17:10.3417143Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2022-11-23T01:17:10.3423719Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:17:10.3424264Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/ 2022-11-23T01:17:10.3424816Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:17:10.3425385Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:17:10.3425928Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:17:10.3428164Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:17:10.3429213Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:17:10.3429782Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:17:10.3430333Z creating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:17:10.3433059Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:17:10.3434130Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:17:10.3436017Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:17:10.3436793Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:17:10.3438879Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:17:10.3439730Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:17:10.3440339Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:17:10.3440896Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:17:10.3495566Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:17:10.3496295Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:17:10.3497027Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:17:10.3497751Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:17:10.3498483Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:17:10.3499189Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 
2022-11-23T01:17:10.3499888Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:17:10.3500586Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:17:10.3501423Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:17:10.3543688Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:17:10.3585161Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:17:10.3586206Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:17:10.3586983Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:17:10.3587621Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:17:10.3588454Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:17:10.3589423Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:17:10.3590409Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:17:10.3592452Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:17:10.3665939Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:17:10.3739394Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:17:10.3740049Z 
inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:17:10.3740593Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:17:10.3741292Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeError.log 2022-11-23T01:17:10.3741855Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2022-11-23T01:17:10.3742402Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2022-11-23T01:17:10.3742981Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2022-11-23T01:17:10.3743739Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2022-11-23T01:17:10.3744348Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2022-11-23T01:17:10.3744956Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2022-11-23T01:17:10.3745689Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2022-11-23T01:17:10.3746785Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2022-11-23T01:17:10.3747454Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2022-11-23T01:17:10.3748163Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2022-11-23T01:17:10.3748772Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2022-11-23T01:17:10.3770179Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2022-11-23T01:17:10.3833471Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2022-11-23T01:17:10.3834121Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:17:10.3834713Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:17:10.3835519Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2022-11-23T01:17:10.3836554Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2022-11-23T01:17:10.3837761Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2022-11-23T01:17:10.3838284Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2022-11-23T01:17:10.3841138Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2022-11-23T01:17:10.3841930Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2022-11-23T01:17:10.3842754Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2022-11-23T01:17:10.3892016Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2022-11-23T01:17:10.3892505Z creating: build/custom_test_artifacts/custom-backend-build/ 2022-11-23T01:17:10.3893003Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2022-11-23T01:17:10.3899744Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:17:10.3900310Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/ 2022-11-23T01:17:10.3900895Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:17:10.3901482Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:17:10.3902053Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:17:10.3904016Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:17:10.3905212Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:17:10.3905806Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:17:10.3906406Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:17:10.3908975Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:17:10.3910103Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:17:10.3911752Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:17:10.3912553Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:17:10.3914240Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:17:10.3915508Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:17:10.3916148Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:17:10.3916750Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:17:10.3971155Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:17:10.3971902Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:17:10.3972662Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:17:10.3973444Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:17:10.3974202Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:17:10.3974918Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-11-23T01:17:10.3975771Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:17:10.3976525Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:17:10.3977258Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:17:10.4019426Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:17:10.4060987Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:17:10.4062079Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:17:10.4062983Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:17:10.4063639Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:17:10.4064454Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:17:10.4066141Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:17:10.4066937Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:17:10.4069016Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:17:10.4142611Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:17:10.4216165Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:17:10.4217421Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:17:10.4218013Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:17:10.4218737Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeError.log 2022-11-23T01:17:10.4219467Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2022-11-23T01:17:10.4220072Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2022-11-23T01:17:10.4220687Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2022-11-23T01:17:10.4221353Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2022-11-23T01:17:10.4221996Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2022-11-23T01:17:10.4222627Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2022-11-23T01:17:10.4223261Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2022-11-23T01:17:10.4224264Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2022-11-23T01:17:10.4225062Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2022-11-23T01:17:10.4225726Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2022-11-23T01:17:10.4226357Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2022-11-23T01:17:10.4231056Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2022-11-23T01:17:10.4378854Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2022-11-23T01:17:10.4379509Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2022-11-23T01:17:10.4380140Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2022-11-23T01:17:10.4380819Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2022-11-23T01:17:10.4381571Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2022-11-23T01:17:10.4382215Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2022-11-23T01:17:10.4382847Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2022-11-23T01:17:10.4383746Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2022-11-23T01:17:10.4384411Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2022-11-23T01:17:10.4385070Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2022-11-23T01:17:10.4385691Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2022-11-23T01:17:10.4406624Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2022-11-23T01:17:10.4465126Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2022-11-23T01:17:10.4465828Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:17:10.4466484Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:17:10.4467095Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2022-11-23T01:17:10.4467967Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2022-11-23T01:17:10.4469201Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2022-11-23T01:17:10.4469766Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2022-11-23T01:17:10.4473081Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2022-11-23T01:17:10.4473866Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2022-11-23T01:17:10.4474636Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2022-11-23T01:17:10.4593857Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2022-11-23T01:17:10.4639879Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2022-11-23T01:17:10.4640233Z creating: build/lib/ 2022-11-23T01:17:10.4640835Z inflating: build/lib/libclog.a 2022-11-23T01:17:10.4707169Z inflating: build/lib/libgtest.a 2022-11-23T01:17:10.4717660Z inflating: build/lib/libpthreadpool.a 2022-11-23T01:17:10.4823920Z inflating: build/lib/libprotobuf-lite.a 2022-11-23T01:17:10.4915417Z inflating: 
build/lib/libbenchmark.a 2022-11-23T01:17:10.4924777Z inflating: build/lib/libittnotify.a 2022-11-23T01:17:10.4956869Z inflating: build/lib/libtensorpipe_uv.a 2022-11-23T01:17:10.5033642Z inflating: build/lib/libasmjit.a 2022-11-23T01:17:10.5568819Z inflating: build/lib/libprotobuf.a 2022-11-23T01:17:10.5701782Z inflating: build/lib/libgloo.a 2022-11-23T01:17:10.5734299Z inflating: build/lib/libfmt.a 2022-11-23T01:17:10.5736182Z inflating: build/lib/libcaffe2_nvrtc.so 2022-11-23T01:17:10.5736779Z inflating: build/lib/libfoxi_loader.a 2022-11-23T01:17:10.5816648Z inflating: build/lib/libc10.so 2022-11-23T01:17:10.5818115Z inflating: build/lib/libtorch_global_deps.so 2022-11-23T01:17:10.5827975Z inflating: build/lib/libcpuinfo.a 2022-11-23T01:17:10.5837195Z inflating: build/lib/libcpuinfo_internals.a 2022-11-23T01:17:10.5839785Z inflating: build/lib/libnnpack_reference_layers.a 2022-11-23T01:17:10.6410833Z inflating: build/lib/libprotoc.a 2022-11-23T01:17:10.6429737Z inflating: build/lib/libgmock.a 2022-11-23T01:17:10.6430296Z inflating: build/lib/libgtest_main.a 2022-11-23T01:17:10.6431359Z inflating: build/lib/libbenchmark_main.a 2022-11-23T01:17:10.6573857Z inflating: build/lib/libXNNPACK.a 2022-11-23T01:17:11.6387597Z inflating: build/lib/libdnnl.a 2022-11-23T01:17:11.7043790Z inflating: build/lib/libtensorpipe.a 2022-11-23T01:17:11.7096847Z inflating: build/lib/libc10_cuda.so 2022-11-23T01:17:11.7112794Z inflating: build/lib/libqnnpack.a 2022-11-23T01:17:11.7113473Z inflating: build/lib/libgmock_main.a 2022-11-23T01:17:11.8659159Z inflating: build/lib/libfbgemm.a 2022-11-23T01:17:11.8682386Z inflating: build/lib/libpytorch_qnnpack.a 2022-11-23T01:17:11.9831050Z inflating: build/lib/libdnnl_graph.a 2022-11-23T01:17:12.0347926Z inflating: build/lib/libkineto.a 2022-11-23T01:17:12.0638468Z inflating: build/lib/libtensorpipe_cuda.a 2022-11-23T01:17:12.0684070Z inflating: build/lib/libcaffe2_protos.a 2022-11-23T01:17:12.0732354Z inflating: build/lib/libonnx_proto.a 
2022-11-23T01:17:12.0754484Z inflating: build/lib/libnnpack.a 2022-11-23T01:17:12.1433521Z inflating: build/lib/libonnx.a 2022-11-23T01:17:12.1868263Z inflating: build/lib/libgloo_cuda.a 2022-11-23T01:17:14.5500586Z inflating: build/lib/libtorch_cpu.so 2022-11-23T01:17:16.6755016Z inflating: build/lib/libtorch_cuda.so 2022-11-23T01:17:16.6756030Z inflating: build/lib/libtorch.so 2022-11-23T01:17:16.6759128Z inflating: build/lib/libc10d_cuda_test.so 2022-11-23T01:17:17.6657776Z inflating: build/lib/libtorch_cuda_linalg.so 2022-11-23T01:17:17.6681592Z inflating: build/lib/libjitbackend_test.so 2022-11-23T01:17:17.6741726Z inflating: build/lib/libtorchbind_test.so 2022-11-23T01:17:17.6772686Z inflating: build/lib/libbackend_with_compiler.so 2022-11-23T01:17:17.6777369Z inflating: build/lib/libshm.so 2022-11-23T01:17:17.8583077Z inflating: build/lib/libtorch_python.so 2022-11-23T01:17:17.8622812Z inflating: build/lib/libnnapi_backend.so 2022-11-23T01:17:17.8623138Z creating: build/bin/ 2022-11-23T01:17:17.8675776Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2022-11-23T01:17:17.8730925Z inflating: build/bin/c10_DeviceGuard_test 2022-11-23T01:17:17.8784560Z inflating: build/bin/c10_Device_test 2022-11-23T01:17:17.8835906Z inflating: build/bin/c10_StreamGuard_test 2022-11-23T01:17:17.8898916Z inflating: build/bin/c10_DispatchKeySet_test 2022-11-23T01:17:17.8951232Z inflating: build/bin/c10_SymInt_test 2022-11-23T01:17:17.9011012Z inflating: build/bin/c10_InlineDeviceGuard_test 2022-11-23T01:17:17.9070502Z inflating: build/bin/c10_InlineStreamGuard_test 2022-11-23T01:17:17.9131550Z inflating: build/bin/c10_SizesAndStrides_test 2022-11-23T01:17:17.9182780Z inflating: build/bin/c10_Array_test 2022-11-23T01:17:17.9239533Z inflating: build/bin/c10_Bitset_test 2022-11-23T01:17:17.9294273Z inflating: build/bin/c10_C++17_test 2022-11-23T01:17:17.9345496Z inflating: build/bin/c10_ConstexprCrc_test 2022-11-23T01:17:17.9398117Z inflating: 
build/bin/c10_DeadlockDetection_test 2022-11-23T01:17:17.9450987Z inflating: build/bin/c10_Half_test 2022-11-23T01:17:17.9512286Z inflating: build/bin/c10_LeftRight_test 2022-11-23T01:17:17.9579343Z inflating: build/bin/c10_Metaprogramming_test 2022-11-23T01:17:17.9735294Z inflating: build/bin/c10_SmallVectorTest 2022-11-23T01:17:17.9788881Z inflating: build/bin/c10_Synchronized_test 2022-11-23T01:17:17.9850396Z inflating: build/bin/c10_ThreadLocal_test 2022-11-23T01:17:17.9906526Z inflating: build/bin/c10_TypeIndex_test 2022-11-23T01:17:17.9960367Z inflating: build/bin/c10_TypeList_test 2022-11-23T01:17:18.0011828Z inflating: build/bin/c10_TypeTraits_test 2022-11-23T01:17:18.0066752Z inflating: build/bin/c10_accumulate_test 2022-11-23T01:17:18.0126534Z inflating: build/bin/c10_bfloat16_test 2022-11-23T01:17:18.0183998Z inflating: build/bin/c10_complex_math_test 2022-11-23T01:17:18.0243371Z inflating: build/bin/c10_complex_test 2022-11-23T01:17:18.0361185Z inflating: build/bin/c10_either_test 2022-11-23T01:17:18.0417127Z inflating: build/bin/c10_exception_test 2022-11-23T01:17:18.0470338Z inflating: build/bin/c10_flags_test 2022-11-23T01:17:18.0653516Z inflating: build/bin/c10_intrusive_ptr_test 2022-11-23T01:17:18.0707323Z inflating: build/bin/c10_irange_test 2022-11-23T01:17:18.0769390Z inflating: build/bin/c10_logging_test 2022-11-23T01:17:18.0849311Z inflating: build/bin/c10_optional_test 2022-11-23T01:17:18.0915920Z inflating: build/bin/c10_ordered_preserving_dict_test 2022-11-23T01:17:18.0973938Z inflating: build/bin/c10_registry_test 2022-11-23T01:17:18.1037553Z inflating: build/bin/c10_string_view_test 2022-11-23T01:17:18.1092467Z inflating: build/bin/c10_tempfile_test 2022-11-23T01:17:18.1152734Z inflating: build/bin/c10_intrusive_ptr_benchmark 2022-11-23T01:17:18.1213219Z inflating: build/bin/c10_typeid_test 2022-11-23T01:17:18.1736004Z inflating: build/bin/protoc-3.13.0.0 2022-11-23T01:17:18.2258493Z inflating: build/bin/protoc 
2022-11-23T01:17:18.2310346Z inflating: build/bin/c10_cuda_CUDATest 2022-11-23T01:17:18.2626736Z inflating: build/bin/vec_test_all_types_DEFAULT 2022-11-23T01:17:18.2982810Z inflating: build/bin/vec_test_all_types_AVX2 2022-11-23T01:17:18.3047722Z inflating: build/bin/TCPStoreTest 2022-11-23T01:17:18.3105094Z inflating: build/bin/HashStoreTest 2022-11-23T01:17:18.3162701Z inflating: build/bin/FileStoreTest 2022-11-23T01:17:18.3178526Z inflating: build/bin/ProcessGroupMPITest 2022-11-23T01:17:18.3181695Z inflating: build/bin/example_allreduce 2022-11-23T01:17:18.3237841Z inflating: build/bin/Dimname_test 2022-11-23T01:17:18.3316335Z inflating: build/bin/Dict_test 2022-11-23T01:17:18.3384485Z inflating: build/bin/MaybeOwned_test 2022-11-23T01:17:18.3446215Z inflating: build/bin/NamedTensor_test 2022-11-23T01:17:18.3509495Z inflating: build/bin/apply_utils_test 2022-11-23T01:17:18.3572426Z inflating: build/bin/atest 2022-11-23T01:17:18.3638748Z inflating: build/bin/basic 2022-11-23T01:17:18.3696223Z inflating: build/bin/broadcast_test 2022-11-23T01:17:18.3758860Z inflating: build/bin/cpu_generator_test 2022-11-23T01:17:18.3814691Z inflating: build/bin/cpu_profiling_allocator_test 2022-11-23T01:17:18.3867880Z inflating: build/bin/dispatch_key_set_test 2022-11-23T01:17:18.3963167Z inflating: build/bin/cpu_rng_test 2022-11-23T01:17:18.4016165Z inflating: build/bin/dlconvertor_test 2022-11-23T01:17:18.4078451Z inflating: build/bin/extension_backend_test 2022-11-23T01:17:18.4138185Z inflating: build/bin/half_test 2022-11-23T01:17:18.4240247Z inflating: build/bin/ivalue_test 2022-11-23T01:17:18.4292834Z inflating: build/bin/lazy_tensor_test 2022-11-23T01:17:18.4350153Z inflating: build/bin/memory_format_test 2022-11-23T01:17:18.4407865Z inflating: build/bin/math_kernel_test 2022-11-23T01:17:18.4465425Z inflating: build/bin/memory_overlapping_test 2022-11-23T01:17:18.4525788Z inflating: build/bin/native_test 2022-11-23T01:17:18.4581631Z inflating: 
build/bin/mobile_memory_cleanup 2022-11-23T01:17:18.4635856Z inflating: build/bin/operator_name_test 2022-11-23T01:17:18.4689416Z inflating: build/bin/operators_test 2022-11-23T01:17:18.4745492Z inflating: build/bin/packedtensoraccessor_test 2022-11-23T01:17:18.4815864Z inflating: build/bin/pow_test 2022-11-23T01:17:18.4877203Z inflating: build/bin/quantized_test 2022-11-23T01:17:18.4930085Z inflating: build/bin/reduce_ops_test 2022-11-23T01:17:18.4984093Z inflating: build/bin/reportMemoryUsage_test 2022-11-23T01:17:18.5044580Z inflating: build/bin/scalar_tensor_test 2022-11-23T01:17:18.5105898Z inflating: build/bin/scalar_test 2022-11-23T01:17:18.5161751Z inflating: build/bin/stride_properties_test 2022-11-23T01:17:18.5246362Z inflating: build/bin/tensor_iterator_test 2022-11-23T01:17:18.5249006Z inflating: build/bin/thread_init_test 2022-11-23T01:17:18.5308667Z inflating: build/bin/type_ptr_test 2022-11-23T01:17:18.5368558Z inflating: build/bin/test_parallel 2022-11-23T01:17:18.5421394Z inflating: build/bin/variant_test 2022-11-23T01:17:18.5486877Z inflating: build/bin/type_test 2022-11-23T01:17:18.5542538Z inflating: build/bin/undefined_tensor_test 2022-11-23T01:17:18.5543897Z inflating: build/bin/verify_api_visibility 2022-11-23T01:17:18.5618820Z inflating: build/bin/vmap_test 2022-11-23T01:17:18.5673285Z inflating: build/bin/weakref_test 2022-11-23T01:17:18.5727761Z inflating: build/bin/wrapdim_test 2022-11-23T01:17:18.5791856Z inflating: build/bin/IListRef_test 2022-11-23T01:17:18.5909975Z inflating: build/bin/List_test 2022-11-23T01:17:18.5962386Z inflating: build/bin/xla_tensor_test 2022-11-23T01:17:18.6093495Z inflating: build/bin/kernel_function_legacy_test 2022-11-23T01:17:18.6197426Z inflating: build/bin/kernel_function_test 2022-11-23T01:17:18.6267092Z inflating: build/bin/KernelFunction_test 2022-11-23T01:17:18.6405894Z inflating: build/bin/kernel_lambda_legacy_test 2022-11-23T01:17:18.6517342Z inflating: build/bin/kernel_lambda_test 
2022-11-23T01:17:18.6581416Z inflating: build/bin/kernel_stackbased_test 2022-11-23T01:17:18.6685462Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2022-11-23T01:17:18.6739357Z inflating: build/bin/CppSignature_test 2022-11-23T01:17:18.6790572Z inflating: build/bin/op_allowlist_test 2022-11-23T01:17:18.6847849Z inflating: build/bin/inline_container_test 2022-11-23T01:17:18.6908039Z inflating: build/bin/backend_fallback_test 2022-11-23T01:17:18.7222330Z inflating: build/bin/op_registration_test 2022-11-23T01:17:18.7278169Z inflating: build/bin/cuda_apply_test 2022-11-23T01:17:18.7342258Z inflating: build/bin/cuda_atomic_ops_test 2022-11-23T01:17:18.7399581Z inflating: build/bin/cuda_caching_host_allocator_test 2022-11-23T01:17:18.7471899Z inflating: build/bin/cuda_complex_math_test 2022-11-23T01:17:18.7524974Z inflating: build/bin/cuda_device_test 2022-11-23T01:17:18.7587472Z inflating: build/bin/cuda_complex_test 2022-11-23T01:17:18.7650727Z inflating: build/bin/cuda_cub_test 2022-11-23T01:17:18.7704196Z inflating: build/bin/cuda_dlconvertor_test 2022-11-23T01:17:18.7758078Z inflating: build/bin/cuda_integer_divider_test 2022-11-23T01:17:18.7829950Z inflating: build/bin/cuda_distributions_test 2022-11-23T01:17:18.7892710Z inflating: build/bin/cuda_generator_test 2022-11-23T01:17:18.7945215Z inflating: build/bin/cuda_half_test 2022-11-23T01:17:18.7997388Z inflating: build/bin/cuda_optional_test 2022-11-23T01:17:18.8053235Z inflating: build/bin/cuda_reportMemoryUsage_test 2022-11-23T01:17:18.8118167Z inflating: build/bin/cuda_stream_test 2022-11-23T01:17:18.8173014Z inflating: build/bin/cuda_packedtensoraccessor_test 2022-11-23T01:17:18.8225204Z inflating: build/bin/cuda_cudnn_test 2022-11-23T01:17:18.8280918Z inflating: build/bin/cuda_vectorized_test 2022-11-23T01:17:18.8298585Z inflating: build/bin/tutorial_tensorexpr 2022-11-23T01:17:18.8368508Z inflating: build/bin/ProcessGroupGlooTest 2022-11-23T01:17:18.8430787Z inflating: 
build/bin/ProcessGroupGlooAsyncTest 2022-11-23T01:17:18.8497012Z inflating: build/bin/ProcessGroupNCCLTest 2022-11-23T01:17:18.8559884Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2022-11-23T01:17:18.8616460Z inflating: build/bin/ProcessGroupUCCTest 2022-11-23T01:17:18.8674279Z inflating: build/bin/test_dist_autograd 2022-11-23T01:17:18.8749202Z inflating: build/bin/test_cpp_rpc 2022-11-23T01:17:18.8751891Z inflating: build/bin/parallel_benchmark 2022-11-23T01:17:18.8826001Z inflating: build/bin/test_mobile_nnc 2022-11-23T01:17:18.8837412Z inflating: build/bin/aot_model_compiler_test 2022-11-23T01:17:18.9753545Z inflating: build/bin/test_tensorexpr 2022-11-23T01:17:19.0140570Z inflating: build/bin/test_lazy 2022-11-23T01:17:19.0146121Z inflating: build/bin/torch_shm_manager 2022-11-23T01:17:19.1473114Z inflating: build/bin/test_api 2022-11-23T01:17:19.2684576Z inflating: build/bin/test_jit 2022-11-23T01:17:19.2687905Z inflating: .pytorch-test-times.json 2022-11-23T01:17:19.2717414Z ##[group]Run df -H 2022-11-23T01:17:19.2717669Z df -H 2022-11-23T01:17:19.2730752Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:17:19.2731034Z env: 2022-11-23T01:17:19.2731281Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:17:19.2731561Z GPU_FLAG: --gpus all 2022-11-23T01:17:19.2731797Z ##[endgroup] 2022-11-23T01:17:19.2771491Z Filesystem Size Used Avail Use% Mounted on 2022-11-23T01:17:19.2772212Z devtmpfs 129G 0 129G 0% /dev 2022-11-23T01:17:19.2772834Z tmpfs 129G 6.3M 129G 1% /dev/shm 2022-11-23T01:17:19.2773127Z tmpfs 129G 553k 129G 1% /run 2022-11-23T01:17:19.2773432Z tmpfs 129G 0 129G 0% /sys/fs/cgroup 2022-11-23T01:17:19.2773732Z /dev/xvda1 162G 29G 134G 18% / 2022-11-23T01:17:19.2799720Z ##[group]Run .github/scripts/parse_ref.py 2022-11-23T01:17:19.2800360Z .github/scripts/parse_ref.py 2022-11-23T01:17:19.2815588Z shell: /usr/bin/bash -e {0} 2022-11-23T01:17:19.2815853Z env: 2022-11-23T01:17:19.2816082Z GIT_DEFAULT_BRANCH: master 
2022-11-23T01:17:19.2816360Z GPU_FLAG: --gpus all 2022-11-23T01:17:19.2816616Z ##[endgroup] 2022-11-23T01:17:19.3112406Z ##[group]Run set -x 2022-11-23T01:17:19.3112787Z set -x 2022-11-23T01:17:19.3113020Z 2022-11-23T01:17:19.3113279Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2022-11-23T01:17:19.3113635Z TEST_COMMAND=.jenkins/pytorch/multigpu-test.sh 2022-11-23T01:17:19.3113992Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2022-11-23T01:17:19.3114304Z TEST_COMMAND=.jenkins/caffe2/test.sh 2022-11-23T01:17:19.3114585Z else 2022-11-23T01:17:19.3114867Z TEST_COMMAND=.jenkins/pytorch/test.sh 2022-11-23T01:17:19.3115596Z fi 2022-11-23T01:17:19.3115811Z 2022-11-23T01:17:19.3116137Z COMMIT_MESSAGES=$(git cherry -v "origin/${GIT_DEFAULT_BRANCH:-master}") 2022-11-23T01:17:19.3116461Z 2022-11-23T01:17:19.3116748Z # sanitize the input commit message and PR body here: 2022-11-23T01:17:19.3117044Z # 2022-11-23T01:17:19.3117428Z # trim all new lines from commit messages + PR_BODY to avoid issues with batch environment 2022-11-23T01:17:19.3117920Z # variable copying. 
see https://github.com/pytorch/pytorch/pull/80043#issuecomment-1167796028 2022-11-23T01:17:19.3118345Z COMMIT_MESSAGES="${COMMIT_MESSAGES//[$'\n\r']}" 2022-11-23T01:17:19.3118667Z PR_BODY="${PR_BODY//[$'\n\r']}" 2022-11-23T01:17:19.3118910Z  2022-11-23T01:17:19.3119269Z # then trim all special characters like single and double quotes to avoid unescaped inputs to 2022-11-23T01:17:19.3119651Z # wreak havoc internally 2022-11-23T01:17:19.3119982Z export COMMIT_MESSAGES="${COMMIT_MESSAGES//[\'\"]}" 2022-11-23T01:17:19.3120299Z export PR_BODY="${PR_BODY//[\'\"]}" 2022-11-23T01:17:19.3120564Z  2022-11-23T01:17:19.3120876Z # detached container should get cleaned up by teardown_ec2_linux 2022-11-23T01:17:19.3121271Z # TODO: Stop building test binaries as part of the build phase 2022-11-23T01:17:19.3121650Z # Used for GPU_FLAG since that doesn't play nice 2022-11-23T01:17:19.3121986Z # shellcheck disable=SC2086,SC2090 2022-11-23T01:17:19.3122292Z container_name=$(docker run \ 2022-11-23T01:17:19.3122559Z  ${GPU_FLAG:-} \ 2022-11-23T01:17:19.3122839Z  -e BUILD_ENVIRONMENT \ 2022-11-23T01:17:19.3123114Z  -e PR_NUMBER \ 2022-11-23T01:17:19.3123367Z  -e GITHUB_ACTIONS \ 2022-11-23T01:17:19.3123629Z  -e BASE_SHA \ 2022-11-23T01:17:19.3123881Z  -e BRANCH \ 2022-11-23T01:17:19.3124255Z  -e SHA1 \ 2022-11-23T01:17:19.3124516Z  -e AWS_DEFAULT_REGION \ 2022-11-23T01:17:19.3124792Z  -e IN_WHEEL_TEST \ 2022-11-23T01:17:19.3125040Z  -e SHARD_NUMBER \ 2022-11-23T01:17:19.3125301Z  -e TEST_CONFIG \ 2022-11-23T01:17:19.3125568Z  -e NUM_TEST_SHARDS \ 2022-11-23T01:17:19.3125816Z  -e PR_BODY \ 2022-11-23T01:17:19.3126086Z  -e COMMIT_MESSAGES \ 2022-11-23T01:17:19.3126382Z  -e PYTORCH_RETRY_TEST_CASES \ 2022-11-23T01:17:19.3126681Z  -e PYTORCH_OVERRIDE_FLAKY_SIGNAL \ 2022-11-23T01:17:19.3126971Z  -e PR_LABELS \ 2022-11-23T01:17:19.3127265Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2022-11-23T01:17:19.3127560Z  -e SCCACHE_BUCKET \ 2022-11-23T01:17:19.3127825Z  -e SCCACHE_S3_KEY_PREFIX \ 
2022-11-23T01:17:19.3128099Z  -e XLA_CUDA \ 2022-11-23T01:17:19.3128389Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2022-11-23T01:17:19.3128700Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2022-11-23T01:17:19.3129034Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2022-11-23T01:17:19.3129388Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2022-11-23T01:17:19.3129699Z  --ulimit stack=10485760:83886080 \ 2022-11-23T01:17:19.3130015Z  --security-opt seccomp=unconfined \ 2022-11-23T01:17:19.3130420Z  --cap-add=SYS_PTRACE \ 2022-11-23T01:17:19.3130709Z  --ipc=host \ 2022-11-23T01:17:19.3130961Z  --shm-size="${SHM_SIZE}" \ 2022-11-23T01:17:19.3131224Z  --tty \ 2022-11-23T01:17:19.3131467Z  --detach \ 2022-11-23T01:17:19.3131721Z  --name="${container_name}" \ 2022-11-23T01:17:19.3132002Z  --user jenkins \ 2022-11-23T01:17:19.3132329Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2022-11-23T01:17:19.3132659Z  -w /var/lib/jenkins/workspace \ 2022-11-23T01:17:19.3132948Z  "${DOCKER_IMAGE}" 2022-11-23T01:17:19.3133192Z ) 2022-11-23T01:17:19.3133478Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2022-11-23T01:17:19.3133932Z docker exec -t "${container_name}" sh -c "pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2022-11-23T01:17:19.3145662Z shell: /usr/bin/bash -e {0} 2022-11-23T01:17:19.3145901Z env: 2022-11-23T01:17:19.3146148Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:17:19.3146419Z GPU_FLAG: --gpus all 2022-11-23T01:17:19.3146749Z BUILD_ENVIRONMENT: linux-bionic-cuda11.7-py3.10-gcc7 2022-11-23T01:17:19.3147047Z PR_NUMBER: 2022-11-23T01:17:19.3147286Z BRANCH: master 2022-11-23T01:17:19.3147564Z SHA1: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:17:19.3147872Z BASE_SHA: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:17:19.3148174Z PYTORCH_RETRY_TEST_CASES: 1 2022-11-23T01:17:19.3148463Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-11-23T01:17:19.3148727Z TEST_CONFIG: distributed 2022-11-23T01:17:19.3148987Z SHARD_NUMBER: 3 
2022-11-23T01:17:19.3149230Z NUM_TEST_SHARDS: 3 2022-11-23T01:17:19.3149452Z PR_BODY: 2022-11-23T01:17:19.3149758Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2022-11-23T01:17:19.3150091Z SCCACHE_S3_KEY_PREFIX: trunk 2022-11-23T01:17:19.3150335Z SHM_SIZE: 2g 2022-11-23T01:17:19.3150829Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:17:19.3151305Z XLA_CUDA: 2022-11-23T01:17:19.3151661Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2022-11-23T01:17:19.3152034Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2022-11-23T01:17:19.3152337Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2022-11-23T01:17:19.3152616Z ##[endgroup] 2022-11-23T01:17:19.3181521Z + [[ distributed == \m\u\l\t\i\g\p\u ]] 2022-11-23T01:17:19.3182040Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *onnx* ]] 2022-11-23T01:17:19.3182519Z + TEST_COMMAND=.jenkins/pytorch/test.sh 2022-11-23T01:17:19.3185422Z ++ git cherry -v origin/master 2022-11-23T01:17:19.3202905Z + COMMIT_MESSAGES= 2022-11-23T01:17:19.3203172Z + COMMIT_MESSAGES= 2022-11-23T01:17:19.3203404Z + PR_BODY= 2022-11-23T01:17:19.3203664Z + export COMMIT_MESSAGES= 2022-11-23T01:17:19.3203930Z + COMMIT_MESSAGES= 2022-11-23T01:17:19.3204187Z + export PR_BODY= 2022-11-23T01:17:19.3204409Z + PR_BODY= 2022-11-23T01:17:19.3213232Z +++ nproc --ignore=2 2022-11-23T01:17:19.3226033Z ++ docker run --gpus all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e PR_BODY -e COMMIT_MESSAGES -e PYTORCH_RETRY_TEST_CASES -e PYTORCH_OVERRIDE_FLAKY_SIGNAL -e PR_LABELS -e MAX_JOBS=30 -e SCCACHE_BUCKET -e SCCACHE_S3_KEY_PREFIX -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS --env-file=/tmp/github_env_3528293554 --ulimit stack=10485760:83886080 
--security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:17:33.3382220Z + container_name=d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T01:17:33.3382826Z + echo DOCKER_CONTAINER_ID=d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T01:17:33.3387291Z ++ echo dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:17:33.3389303Z + docker exec -t d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 sh -c 'pip install dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl[opt-einsum] && .jenkins/pytorch/test.sh' 2022-11-23T01:17:33.8938760Z Processing ./dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:17:34.8512283Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (4.4.0) 2022-11-23T01:17:34.8515248Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (1.11.1) 2022-11-23T01:17:34.8520587Z Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (2.6.3) 2022-11-23T01:17:34.8537428Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (3.3.0) 2022-11-23T01:17:34.8616883Z Requirement already satisfied: numpy>=1.7 in /opt/conda/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==1.14.0a0+git1cfd385) (1.21.2) 2022-11-23T01:17:34.8835003Z Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch==1.14.0a0+git1cfd385) 
(1.2.1) 2022-11-23T01:17:35.7926514Z Installing collected packages: torch 2022-11-23T01:17:45.6101260Z Successfully installed torch-1.14.0a0+git1cfd385 2022-11-23T01:17:45.6768867Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2022-11-23T01:17:45.6990542Z + TORCH_INSTALL_DIR=/opt/conda/lib/python3.10/site-packages/torch 2022-11-23T01:17:45.6991379Z + TORCH_BIN_DIR=/opt/conda/lib/python3.10/site-packages/torch/bin 2022-11-23T01:17:45.6992070Z + TORCH_LIB_DIR=/opt/conda/lib/python3.10/site-packages/torch/lib 2022-11-23T01:17:45.6993998Z + TORCH_TEST_DIR=/opt/conda/lib/python3.10/site-packages/torch/test 2022-11-23T01:17:45.6994421Z + BUILD_DIR=build 2022-11-23T01:17:45.6994951Z + BUILD_RENAMED_DIR=build_renamed 2022-11-23T01:17:45.6995322Z + BUILD_BIN_DIR=build/bin 2022-11-23T01:17:45.6995661Z + export VALGRIND=ON 2022-11-23T01:17:45.6996426Z + VALGRIND=ON 2022-11-23T01:17:45.6996971Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *clang9* ]] 2022-11-23T01:17:45.6997441Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 != *bazel* ]] 2022-11-23T01:17:45.6998159Z ++ realpath build/custom_test_artifacts 2022-11-23T01:17:45.7005146Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2022-11-23T01:17:45.7008352Z ++ dirname .jenkins/pytorch/test.sh 2022-11-23T01:17:45.7015594Z + source .jenkins/pytorch/common.sh 2022-11-23T01:17:45.7019985Z +++ dirname .jenkins/pytorch/common.sh 2022-11-23T01:17:45.7030499Z ++ source .jenkins/pytorch/common_utils.sh 2022-11-23T01:17:45.7032988Z +++ declare -f -t trap_add 2022-11-23T01:17:45.7039350Z ++ set -ex 2022-11-23T01:17:45.7039932Z ++ [[ linux-bionic-cuda11.7-py3.10-gcc7 == *rocm* ]] 2022-11-23T01:17:45.7040372Z ++ BUILD_TEST_LIBTORCH=0 2022-11-23T01:17:45.7041181Z + echo 'Environment variables' 2022-11-23T01:17:45.7041526Z Environment variables 2022-11-23T01:17:45.7041841Z + env 2022-11-23T01:17:45.7049370Z SHARD_NUMBER=3 2022-11-23T01:17:45.7050248Z NV_LIBCUBLAS_DEV_VERSION=11.10.1.25-1 
2022-11-23T01:17:45.7051324Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-11-7 2022-11-23T01:17:45.7051722Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2022-11-23T01:17:45.7052615Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.13.4-1+cuda11.7 2022-11-23T01:17:45.7053885Z UCC_HOME=/usr 2022-11-23T01:17:45.7054588Z BUILD_ENVIRONMENT=linux-bionic-cuda11.7-py3.10-gcc7 2022-11-23T01:17:45.7055419Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2022-11-23T01:17:45.7056051Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-11-7=11.7.3.21-1 2022-11-23T01:17:45.7057236Z INSTALLED_DB=yes 2022-11-23T01:17:45.7057526Z HOSTNAME=d8f8c46cdf70 2022-11-23T01:17:45.7057878Z GITHUB_REF_NAME=master 2022-11-23T01:17:45.7105244Z GITHUB_API_URL=https://api.github.com 2022-11-23T01:17:45.7105840Z OPENSSL_DIR=/opt/openssl 2022-11-23T01:17:45.7106412Z UCC_COMMIT=1c7a7127186e7836f73aafbd7697bbc274a77eee 2022-11-23T01:17:45.7107576Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_7a141d6b-0425-49c5-949d-bbddc123cb57 2022-11-23T01:17:45.7108331Z CUDA_PATH=/usr/local/cuda 2022-11-23T01:17:45.7108890Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2022-11-23T01:17:45.7109461Z GITHUB_RUN_ATTEMPT=1 2022-11-23T01:17:45.7109920Z TEST_CONFIG=distributed 2022-11-23T01:17:45.7110358Z NV_LIBNPP_VERSION=11.7.3.21-1 2022-11-23T01:17:45.7111069Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-11-7=11.7.50-1 2022-11-23T01:17:45.7111644Z GITHUB_REPOSITORY_OWNER=pytorch 2022-11-23T01:17:45.7112110Z GITHUB_ACTIONS=true 2022-11-23T01:17:45.7112573Z NVIDIA_VISIBLE_DEVICES=all 2022-11-23T01:17:45.7113155Z NV_NVPROF_VERSION=11.7.50-1 2022-11-23T01:17:45.7113522Z NV_LIBCUSPARSE_VERSION=11.7.3.50-1 2022-11-23T01:17:45.7113799Z CI=true 2022-11-23T01:17:45.7114052Z PYTORCH_OVERRIDE_FLAKY_SIGNAL=1 2022-11-23T01:17:45.7114452Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-11-7=11.10.1.25-1 2022-11-23T01:17:45.7114770Z BRANCH=master 2022-11-23T01:17:45.7115013Z 
GITHUB_HEAD_REF= 2022-11-23T01:17:45.7115624Z UCX_COMMIT=31e74cac7bee0ef66bef2af72e7d86d9c282e5ab 2022-11-23T01:17:45.7115956Z GITHUB_ACTOR=pytorchmergebot 2022-11-23T01:17:45.7116304Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2022-11-23T01:17:45.7116594Z GITHUB_ACTION_REF= 2022-11-23T01:17:45.7116866Z NCCL_VERSION=2.13.4-1 2022-11-23T01:17:45.7117114Z GITHUB_ACTION=__self 2022-11-23T01:17:45.7117346Z VALGRIND=ON 2022-11-23T01:17:45.7117608Z GITHUB_REF_PROTECTED=true 2022-11-23T01:17:45.7118064Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2022-11-23T01:17:45.7118441Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2022-11-23T01:17:45.7119001Z *** 2022-11-23T01:17:45.7119231Z INSTALLED_VISION=yes 2022-11-23T01:17:45.7119488Z NVARCH=x86_64 2022-11-23T01:17:45.7119789Z NV_LIBCUSPARSE_DEV_VERSION=11.7.3.50-1 2022-11-23T01:17:45.7120077Z HOME=/var/lib/jenkins 2022-11-23T01:17:45.7120630Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_7a141d6b-0425-49c5-949d-bbddc123cb57 2022-11-23T01:17:45.7121043Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2022-11-23T01:17:45.7121355Z GITHUB_ACTION_REPOSITORY= 2022-11-23T01:17:45.7121772Z GITHUB_REF_TYPE=branch 2022-11-23T01:17:45.7122089Z NV_LIBNCCL_PACKAGE_VERSION=2.13.4-1 2022-11-23T01:17:45.7122366Z GITHUB_RETENTION_DAYS=90 2022-11-23T01:17:45.7122746Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2022-11-23T01:17:45.7123149Z NV_LIBNCCL_PACKAGE=libnccl2=2.13.4-1+cuda11.7 2022-11-23T01:17:45.7123695Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_7a141d6b-0425-49c5-949d-bbddc123cb57 2022-11-23T01:17:45.7124097Z DEBIAN_FRONTEND=noninteractive 2022-11-23T01:17:45.7124451Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2022-11-23T01:17:45.7124727Z GITHUB_REF=refs/heads/master 2022-11-23T01:17:45.7125028Z NV_CUDA_LIB_VERSION=11.7.0-1 2022-11-23T01:17:45.7125347Z GITHUB_SHA=1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:17:45.7125652Z 
INSTALLED_PROTOBUF=yes 2022-11-23T01:17:45.7125903Z GITHUB_RUN_ID=3528293554 2022-11-23T01:17:45.7126263Z NV_LIBNPP_PACKAGE=libnpp-11-7=11.7.3.21-1 2022-11-23T01:17:45.7126577Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2022-11-23T01:17:45.7126876Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2022-11-23T01:17:45.7127204Z NV_NVTX_VERSION=11.7.50-1 2022-11-23T01:17:45.7127517Z GITHUB_SERVER_URL=https://github.com 2022-11-23T01:17:45.7127787Z MAX_JOBS=30 2022-11-23T01:17:45.7128082Z NV_LIBCUBLAS_VERSION=11.10.1.25-1 2022-11-23T01:17:45.7128472Z NV_LIBCUBLAS_PACKAGE=libcublas-11-7=11.10.1.25-1 2022-11-23T01:17:45.7129046Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2022-11-23T01:17:45.7129420Z UCX_HOME=/usr 2022-11-23T01:17:45.7129686Z PYTORCH_RETRY_TEST_CASES=1 2022-11-23T01:17:45.7130013Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2022-11-23T01:17:45.7130379Z BASE_SHA=1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:17:45.7130735Z NV_CUDA_CUDART_DEV_VERSION=11.7.60-1 2022-11-23T01:17:45.7130992Z PR_BODY= 2022-11-23T01:17:45.7131233Z GITHUB_BASE_REF= 2022-11-23T01:17:45.7131477Z TERM=xterm 2022-11-23T01:17:45.7131689Z XLA_CUDA= 2022-11-23T01:17:45.7131973Z NV_NVML_DEV_VERSION=11.7.50-1 2022-11-23T01:17:45.7132258Z TORCH_CUDA_ARCH_LIST=Maxwell 2022-11-23T01:17:45.7132513Z CUDA_VERSION=11.7.0 2022-11-23T01:17:45.7132868Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-11-7 2022-11-23T01:17:45.7133183Z OPENSSL_ROOT_DIR=/opt/openssl 2022-11-23T01:17:45.7133732Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_7a141d6b-0425-49c5-949d-bbddc123cb57 2022-11-23T01:17:45.7134141Z GITHUB_JOB=test 2022-11-23T01:17:45.7134410Z SCCACHE_S3_KEY_PREFIX=trunk 2022-11-23T01:17:45.7134680Z COMMIT_MESSAGES= 2022-11-23T01:17:45.7134963Z NVIDIA_DRIVER_CAPABILITIES=compute,utility 2022-11-23T01:17:45.7135262Z NUM_TEST_SHARDS=3 2022-11-23T01:17:45.7135510Z PR_NUMBER= 2022-11-23T01:17:45.7136034Z 
GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_7a141d6b-0425-49c5-949d-bbddc123cb57 2022-11-23T01:17:45.7136430Z SHLVL=1 2022-11-23T01:17:45.7136784Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-11-7 2022-11-23T01:17:45.7137118Z GITHUB_REPOSITORY=pytorch/pytorch 2022-11-23T01:17:45.7138309Z NVIDIA_REQUIRE_CUDA=cuda>=11.7 brand=tesla,driver>=450,driver<451 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=unknown,driver>=510,driver<511 brand=nvidia,driver>=510,driver<511 brand=nvidiartx,driver>=510,driver<511 brand=quadro,driver>=510,driver<511 brand=quadrortx,driver>=510,driver<511 brand=titan,driver>=510,driver<511 brand=titanrtx,driver>=510,driver<511 brand=geforce,driver>=510,driver<511 brand=geforcertx,driver>=510,driver<511 2022-11-23T01:17:45.7139514Z NV_LIBNPP_DEV_VERSION=11.7.3.21-1 2022-11-23T01:17:45.7139835Z SHA1=1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:17:45.7140133Z GITHUB_EVENT_NAME=push 2022-11-23T01:17:45.7140495Z NV_CUDA_CUDART_VERSION=11.7.60-1 2022-11-23T01:17:45.7140859Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2022-11-23T01:17:45.7141162Z GITHUB_RUN_NUMBER=18336 2022-11-23T01:17:45.7141413Z GITHUB_WORKFLOW=trunk 2022-11-23T01:17:45.7141850Z PATH=/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-11-23T01:17:45.7142329Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.13.4-1 2022-11-23T01:17:45.7142770Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:17:45.7143159Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2022-11-23T01:17:45.7143446Z _=/usr/bin/env 
2022-11-23T01:17:45.7143744Z + echo 'Testing pytorch' 2022-11-23T01:17:45.7143993Z Testing pytorch 2022-11-23T01:17:45.7144274Z + export LANG=C.UTF-8 2022-11-23T01:17:45.7144550Z + LANG=C.UTF-8 2022-11-23T01:17:45.7144772Z + PR_NUMBER= 2022-11-23T01:17:45.7145043Z + [[ distributed == \d\e\f\a\u\l\t ]] 2022-11-23T01:17:45.7145358Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2022-11-23T01:17:45.7145767Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *rocm* ]] 2022-11-23T01:17:45.7146098Z + [[ distributed == \s\l\o\w ]] 2022-11-23T01:17:45.7146530Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *slow-gradcheck* ]] 2022-11-23T01:17:45.7146971Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *cuda* ]] 2022-11-23T01:17:45.7147334Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2022-11-23T01:17:45.7147724Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2022-11-23T01:17:45.7148023Z + [[ distributed == *crossref* ]] 2022-11-23T01:17:45.7148315Z + [[ distributed == *dynamo* ]] 2022-11-23T01:17:45.7148606Z + [[ distributed == *inductor* ]] 2022-11-23T01:17:45.7148989Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *rocm* ]] 2022-11-23T01:17:45.7149434Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 != *-bazel-* ]] 2022-11-23T01:17:45.7149827Z + pip_install --user ninja==1.10.2 2022-11-23T01:17:45.7150216Z + pip install --progress-bar off --user ninja==1.10.2 2022-11-23T01:17:46.2581754Z Collecting ninja==1.10.2 2022-11-23T01:17:46.2804975Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2022-11-23T01:17:47.1846558Z Installing collected packages: ninja 2022-11-23T01:17:47.1951277Z  WARNING: The script ninja is installed in '/var/lib/jenkins/.local/bin' which is not on PATH. 2022-11-23T01:17:47.1952154Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 
2022-11-23T01:17:47.2004156Z Successfully installed ninja-1.10.2 2022-11-23T01:17:47.2668689Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-11-23T01:17:47.2669476Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-11-23T01:17:47.2670560Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *asan* ]] 2022-11-23T01:17:47.2671516Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *-tsan* ]] 2022-11-23T01:17:47.2671923Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2022-11-23T01:17:47.2672247Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2022-11-23T01:17:47.2678787Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *tbb* ]] 2022-11-23T01:17:47.2693131Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *libtorch* ]] 2022-11-23T01:17:47.2694153Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *-bazel-* ]] 2022-11-23T01:17:47.2694707Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *-tsan* ]] 2022-11-23T01:17:47.2696662Z + cd test 2022-11-23T01:17:47.2697503Z + python -c 'import torch; print(torch.__config__.show())' 2022-11-23T01:17:48.9759342Z PyTorch built with: 2022-11-23T01:17:48.9759957Z - GCC 7.5 2022-11-23T01:17:48.9760590Z - C++ Version: 201402 2022-11-23T01:17:48.9761609Z - Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-11-23T01:17:48.9762493Z - Intel(R) MKL-DNN v2.7.0 (Git Hash 650085b2f3643aad05c629425983491d63b5c289) 2022-11-23T01:17:48.9762910Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2022-11-23T01:17:48.9763278Z - LAPACK is enabled (usually provided by MKL) 2022-11-23T01:17:48.9763623Z - NNPACK is enabled 2022-11-23T01:17:48.9763963Z - CPU capability usage: AVX2 2022-11-23T01:17:48.9764268Z - CUDA Runtime 11.7 2022-11-23T01:17:48.9764691Z - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52 2022-11-23T01:17:48.9765047Z - CuDNN 8.5 2022-11-23T01:17:48.9765325Z - Magma 2.6.1 2022-11-23T01:17:48.9768541Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Werror -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=1.14.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 2022-11-23T01:17:48.9770887Z 2022-11-23T01:17:49.2173730Z + cd test 2022-11-23T01:17:49.2174322Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2022-11-23T01:17:50.8267079Z ATen/Parallel: 2022-11-23T01:17:50.8282803Z at::get_num_threads() : 16 
2022-11-23T01:17:50.8283164Z at::get_num_interop_threads() : 16 2022-11-23T01:17:50.8283478Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2022-11-23T01:17:50.8283768Z omp_get_max_threads() : 16 2022-11-23T01:17:50.8284417Z Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-11-23T01:17:50.8284846Z mkl_get_max_threads() : 16 2022-11-23T01:17:50.8285296Z Intel(R) MKL-DNN v2.7.0 (Git Hash 650085b2f3643aad05c629425983491d63b5c289) 2022-11-23T01:17:50.8285654Z std::thread::hardware_concurrency() : 32 2022-11-23T01:17:50.8285964Z Environment variables: 2022-11-23T01:17:50.8286266Z OMP_NUM_THREADS : [not set] 2022-11-23T01:17:50.8286533Z MKL_NUM_THREADS : [not set] 2022-11-23T01:17:50.8286825Z ATen parallel backend: OpenMP 2022-11-23T01:17:50.8287014Z 2022-11-23T01:17:51.0561708Z + [[ distributed == *backward* ]] 2022-11-23T01:17:51.0562254Z + [[ distributed == *xla* ]] 2022-11-23T01:17:51.0562575Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2022-11-23T01:17:51.0563135Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *libtorch* ]] 2022-11-23T01:17:51.0563489Z + [[ distributed == distributed ]] 2022-11-23T01:17:51.0563752Z + install_filelock 2022-11-23T01:17:51.0564021Z + pip_install filelock 2022-11-23T01:17:51.0564380Z + pip install --progress-bar off filelock 2022-11-23T01:17:51.5739117Z Collecting filelock 2022-11-23T01:17:51.5943541Z Downloading filelock-3.8.0-py3-none-any.whl (10 kB) 2022-11-23T01:17:52.4760865Z Installing collected packages: filelock 2022-11-23T01:17:52.5149282Z Successfully installed filelock-3.8.0 2022-11-23T01:17:52.5804707Z + install_triton 2022-11-23T01:17:52.5804989Z + local commit 2022-11-23T01:17:52.5805263Z + [[ distributed == *rocm* ]] 2022-11-23T01:17:52.5809252Z ++ get_pinned_commit triton 2022-11-23T01:17:52.5809898Z ++ cat .github/ci_commit_pins/triton.txt 2022-11-23T01:17:52.5824583Z + commit=0d7e7532279e45672555e344646f5c19c3972331 2022-11-23T01:17:52.5825788Z + pip_install --user 
git+https://github.com/openai/triton@0d7e7532279e45672555e344646f5c19c3972331#subdirectory=python 2022-11-23T01:17:52.5826515Z + pip install --progress-bar off --user git+https://github.com/openai/triton@0d7e7532279e45672555e344646f5c19c3972331#subdirectory=python 2022-11-23T01:17:53.0542688Z Collecting git+https://github.com/openai/triton@0d7e7532279e45672555e344646f5c19c3972331#subdirectory=python 2022-11-23T01:17:53.0548809Z Cloning https://github.com/openai/triton (to revision 0d7e7532279e45672555e344646f5c19c3972331) to /tmp/pip-req-build-3ye12g8b 2022-11-23T01:17:53.0569706Z Running command git clone --filter=blob:none --quiet https://github.com/openai/triton /tmp/pip-req-build-3ye12g8b 2022-11-23T01:17:53.8338288Z Running command git rev-parse -q --verify 'sha^0d7e7532279e45672555e344646f5c19c3972331' 2022-11-23T01:17:53.8361049Z Running command git fetch -q https://github.com/openai/triton 0d7e7532279e45672555e344646f5c19c3972331 2022-11-23T01:17:54.2400530Z Running command git checkout -q 0d7e7532279e45672555e344646f5c19c3972331 2022-11-23T01:17:54.4311378Z Resolved https://github.com/openai/triton to commit 0d7e7532279e45672555e344646f5c19c3972331 2022-11-23T01:17:54.4313287Z Running command git submodule update --init --recursive -q 2022-11-23T01:17:55.0415723Z Preparing metadata (setup.py) ... done
2022-11-23T01:17:55.2557949Z Collecting cmake 2022-11-23T01:17:55.3142953Z Downloading cmake-3.25.0-py2.py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23.7 MB) 2022-11-23T01:17:55.6854882Z Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from triton==2.0.0) (3.8.0) 2022-11-23T01:17:55.6858033Z Requirement already satisfied: torch in /opt/conda/lib/python3.10/site-packages (from triton==2.0.0) (1.14.0a0+git1cfd385) 2022-11-23T01:17:55.7115457Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch->triton==2.0.0) (4.4.0) 2022-11-23T01:17:55.7120253Z Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch->triton==2.0.0) (2.6.3) 2022-11-23T01:17:55.7124592Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch->triton==2.0.0) (1.11.1) 2022-11-23T01:17:55.7338580Z Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch->triton==2.0.0) (1.2.1) 2022-11-23T01:17:55.7410647Z Building wheels for collected packages: triton 2022-11-23T01:18:47.8613534Z Building wheel for triton (setup.py) ... done
2022-11-23T01:18:47.9097949Z Created wheel for triton: filename=triton-2.0.0-cp310-cp310-linux_x86_64.whl size=15377935 sha256=1f517f7fe991d3ccdfea7a4eaff73a4a92f2cc19569df28ca560ebd57df84fdf 2022-11-23T01:18:47.9100506Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/3f/1d/23/1c2bc47d618a44f9c949aea4b7e355e737a1f1ed208f009295 2022-11-23T01:18:47.9121453Z Successfully built triton 2022-11-23T01:18:48.7998419Z Installing collected packages: cmake, triton 2022-11-23T01:18:52.2491755Z Successfully installed cmake-3.25.0 triton-2.0.0 2022-11-23T01:18:52.3604537Z + pip_install --user jinja2 2022-11-23T01:18:52.3604982Z + pip install --progress-bar off --user jinja2 2022-11-23T01:18:52.8862896Z Collecting jinja2 2022-11-23T01:18:52.9102524Z Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB) 2022-11-23T01:18:52.9248154Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2) (2.1.1) 2022-11-23T01:18:53.8212349Z Installing collected packages: jinja2 2022-11-23T01:18:53.9289457Z Successfully installed jinja2-3.1.2 2022-11-23T01:18:53.9972690Z + test_distributed 2022-11-23T01:18:53.9973469Z + echo 'Testing distributed python tests' 2022-11-23T01:18:53.9974041Z Testing distributed python tests 2022-11-23T01:18:53.9976411Z + python test/run_test.py --distributed-tests --shard 3 3 --verbose 2022-11-23T01:18:56.2436761Z Ignoring disabled issues: [] 2022-11-23T01:18:56.2832874Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
2022-11-23T01:18:56.2833465Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T01:18:56.2841235Z Found test time stats from artifacts 2022-11-23T01:18:56.2857488Z Selected tests: 2022-11-23T01:18:56.2857847Z distributed/algorithms/quantization/test_quantization 2022-11-23T01:18:56.2858227Z distributed/test_distributed_spawn 2022-11-23T01:18:56.2858536Z distributed/pipeline/sync/test_worker 2022-11-23T01:18:56.2858872Z distributed/pipeline/sync/test_pipeline 2022-11-23T01:18:56.2859221Z distributed/pipeline/sync/test_microbatch 2022-11-23T01:18:56.2859567Z distributed/pipeline/sync/test_deferred_batch_norm 2022-11-23T01:18:56.2863560Z distributed/pipeline/sync/test_bugs 2022-11-23T01:18:56.2863952Z distributed/pipeline/sync/skip/test_tracker 2022-11-23T01:18:56.2864313Z distributed/pipeline/sync/skip/test_leak 2022-11-23T01:18:56.2864660Z distributed/pipeline/sync/skip/test_api 2022-11-23T01:18:56.2864972Z distributed/elastic/timer/api_test 2022-11-23T01:18:56.2865310Z distributed/checkpoint/test_dedup_tensors 2022-11-23T01:18:56.2865680Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T01:18:56.2866022Z distributed/_composable/test_checkpoint 2022-11-23T01:18:56.2866538Z distributed/test_launcher 2022-11-23T01:18:56.2866869Z distributed/elastic/metrics/api_test 2022-11-23T01:18:56.2867663Z distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T01:18:56.2868363Z distributed/_shard/sharded_tensor/test_megatron_prototype 2022-11-23T01:18:56.2869070Z distributed/_tensor/parallel/test_view_sharding_dim_change 2022-11-23T01:18:56.2869705Z distributed/fsdp/test_fsdp_pure_fp16 2022-11-23T01:18:56.2870037Z distributed/elastic/timer/local_timer_test 2022-11-23T01:18:56.2870745Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-11-23T01:18:56.2871437Z distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T01:18:56.2871766Z distributed/_tensor/test_view_ops 2022-11-23T01:18:56.2872079Z 
distributed/fsdp/test_fsdp_input 2022-11-23T01:18:56.2872422Z distributed/_shard/sharded_tensor/ops/test_init 2022-11-23T01:18:56.2872794Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-11-23T01:18:56.2873120Z distributed/fsdp/test_fsdp_overlap 2022-11-23T01:18:56.2873465Z distributed/_tensor/parallel/test_tp_examples 2022-11-23T01:18:56.2873844Z distributed/checkpoint/test_file_system_checkpoint_cpu 2022-11-23T01:18:56.2874182Z distributed/_tensor/test_pointwise_ops 2022-11-23T01:18:56.2874504Z distributed/test_dynamo_distributed 2022-11-23T01:18:56.2874840Z distributed/fsdp/test_fsdp_ignored_modules 2022-11-23T01:18:56.2875665Z distributed/_tensor/parallel/test_tp_style 2022-11-23T01:18:56.2876310Z distributed/algorithms/ddp_comm_hooks/test_ddp_hooks 2022-11-23T01:18:56.2876970Z distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T01:18:56.2877424Z distributed/_tensor/test_common_rules 2022-11-23T01:18:56.2877742Z distributed/fsdp/test_fsdp_comm 2022-11-23T01:18:56.2878042Z distributed/test_c10d_common 2022-11-23T01:18:56.2878353Z distributed/fsdp/test_fsdp_freezing_weights 2022-11-23T01:18:56.2878683Z distributed/_tensor/test_device_mesh 2022-11-23T01:18:56.2878989Z distributed/test_pg_wrapper 2022-11-23T01:18:56.2879282Z distributed/fsdp/test_fsdp_comm_hooks 2022-11-23T01:18:56.2879592Z distributed/test_c10d_pypg 2022-11-23T01:18:56.2879915Z distributed/fsdp/test_fsdp_summon_full_params 2022-11-23T01:18:56.2880243Z distributed/test_c10d_gloo 2022-11-23T01:18:56.2880518Z distributed/fsdp/test_fsdp_core 2022-11-23T01:18:56.2883052Z Prioritized test from test file changes. 
2022-11-23T01:18:56.2883579Z reordering tests for PR: 2022-11-23T01:18:56.2883843Z prioritized: [] 2022-11-23T01:18:56.2888629Z the rest: ['distributed/algorithms/quantization/test_quantization', 'distributed/test_distributed_spawn', 'distributed/pipeline/sync/test_worker', 'distributed/pipeline/sync/test_pipeline', 'distributed/pipeline/sync/test_microbatch', 'distributed/pipeline/sync/test_deferred_batch_norm', 'distributed/pipeline/sync/test_bugs', 'distributed/pipeline/sync/skip/test_tracker', 'distributed/pipeline/sync/skip/test_leak', 'distributed/pipeline/sync/skip/test_api', 'distributed/elastic/timer/api_test', 'distributed/checkpoint/test_dedup_tensors', 'distributed/_shard/sharded_tensor/ops/test_math_ops', 'distributed/_composable/test_checkpoint', 'distributed/test_launcher', 'distributed/elastic/metrics/api_test', 'distributed/_shard/sharded_optim/test_sharded_optim', 'distributed/_shard/sharded_tensor/test_megatron_prototype', 'distributed/_tensor/parallel/test_view_sharding_dim_change', 'distributed/fsdp/test_fsdp_pure_fp16', 'distributed/elastic/timer/local_timer_test', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag', 'distributed/_shard/sharded_tensor/ops/test_softmax', 'distributed/_tensor/test_view_ops', 'distributed/fsdp/test_fsdp_input', 'distributed/_shard/sharded_tensor/ops/test_init', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp', 'distributed/fsdp/test_fsdp_overlap', 'distributed/_tensor/parallel/test_tp_examples', 'distributed/checkpoint/test_file_system_checkpoint_cpu', 'distributed/_tensor/test_pointwise_ops', 'distributed/test_dynamo_distributed', 'distributed/fsdp/test_fsdp_ignored_modules', 'distributed/_tensor/parallel/test_tp_style', 'distributed/algorithms/ddp_comm_hooks/test_ddp_hooks', 'distributed/_shard/sharded_tensor/ops/test_matrix_ops', 'distributed/_tensor/test_common_rules', 'distributed/fsdp/test_fsdp_comm', 'distributed/test_c10d_common', 'distributed/fsdp/test_fsdp_freezing_weights', 
'distributed/_tensor/test_device_mesh', 'distributed/test_pg_wrapper', 'distributed/fsdp/test_fsdp_comm_hooks', 'distributed/test_c10d_pypg', 'distributed/fsdp/test_fsdp_summon_full_params', 'distributed/test_c10d_gloo', 'distributed/fsdp/test_fsdp_core'] 2022-11-23T01:18:56.2891865Z 2022-11-23T01:18:56.2892427Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T01:18:56.3092949Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T01:18:56.3348347Z parallel (file granularity) tests: 2022-11-23T01:18:56.3348860Z 2022-11-23T01:18:56.3349343Z serial (file granularity) tests: 2022-11-23T01:18:56.3349942Z distributed/algorithms/quantization/test_quantization 2022-11-23T01:18:56.3350577Z distributed/test_distributed_spawn 2022-11-23T01:18:56.3350962Z distributed/pipeline/sync/test_worker 2022-11-23T01:18:56.3351284Z distributed/pipeline/sync/test_pipeline 2022-11-23T01:18:56.3351625Z distributed/pipeline/sync/test_microbatch 2022-11-23T01:18:56.3351989Z distributed/pipeline/sync/test_deferred_batch_norm 2022-11-23T01:18:56.3352340Z distributed/pipeline/sync/test_bugs 2022-11-23T01:18:56.3352662Z distributed/pipeline/sync/skip/test_tracker 2022-11-23T01:18:56.3353004Z distributed/pipeline/sync/skip/test_leak 2022-11-23T01:18:56.3353339Z distributed/pipeline/sync/skip/test_api 2022-11-23T01:18:56.3353645Z distributed/elastic/timer/api_test 2022-11-23T01:18:56.3353972Z distributed/checkpoint/test_dedup_tensors 2022-11-23T01:18:56.3354330Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T01:18:56.3354663Z distributed/_composable/test_checkpoint 2022-11-23T01:18:56.3354966Z distributed/test_launcher 2022-11-23T01:18:56.3355763Z distributed/elastic/metrics/api_test 2022-11-23T01:18:56.3356379Z 
distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T01:18:56.3356990Z distributed/_shard/sharded_tensor/test_megatron_prototype 2022-11-23T01:18:56.3357389Z distributed/_tensor/parallel/test_view_sharding_dim_change 2022-11-23T01:18:56.3357749Z distributed/fsdp/test_fsdp_pure_fp16 2022-11-23T01:18:56.3358224Z distributed/elastic/timer/local_timer_test 2022-11-23T01:18:56.3358590Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-11-23T01:18:56.3358966Z distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T01:18:56.3359284Z distributed/_tensor/test_view_ops 2022-11-23T01:18:56.3359592Z distributed/fsdp/test_fsdp_input 2022-11-23T01:18:56.3359923Z distributed/_shard/sharded_tensor/ops/test_init 2022-11-23T01:18:56.3360279Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-11-23T01:18:56.3360620Z distributed/fsdp/test_fsdp_overlap 2022-11-23T01:18:56.3361005Z distributed/_tensor/parallel/test_tp_examples 2022-11-23T01:18:56.3361360Z distributed/checkpoint/test_file_system_checkpoint_cpu 2022-11-23T01:18:56.3361715Z distributed/_tensor/test_pointwise_ops 2022-11-23T01:18:56.3362036Z distributed/test_dynamo_distributed 2022-11-23T01:18:56.3362350Z distributed/fsdp/test_fsdp_ignored_modules 2022-11-23T01:18:56.3362693Z distributed/_tensor/parallel/test_tp_style 2022-11-23T01:18:56.3363059Z distributed/algorithms/ddp_comm_hooks/test_ddp_hooks 2022-11-23T01:18:56.3363435Z distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T01:18:56.3363761Z distributed/_tensor/test_common_rules 2022-11-23T01:18:56.3364075Z distributed/fsdp/test_fsdp_comm 2022-11-23T01:18:56.3364371Z distributed/test_c10d_common 2022-11-23T01:18:56.3364674Z distributed/fsdp/test_fsdp_freezing_weights 2022-11-23T01:18:56.3365083Z distributed/_tensor/test_device_mesh 2022-11-23T01:18:56.3365405Z distributed/test_pg_wrapper 2022-11-23T01:18:56.3365690Z distributed/fsdp/test_fsdp_comm_hooks 2022-11-23T01:18:56.3365990Z distributed/test_c10d_pypg 
2022-11-23T01:18:56.3366318Z distributed/fsdp/test_fsdp_summon_full_params 2022-11-23T01:18:56.3366613Z distributed/test_c10d_gloo 2022-11-23T01:18:56.3366906Z distributed/fsdp/test_fsdp_core 2022-11-23T01:18:58.5400175Z Ignoring disabled issues: [] 2022-11-23T01:18:58.5473786Z Ignoring disabled issues: [] 2022-11-23T01:18:58.9619994Z Running distributed/algorithms/quantization/test_quantization ... [2022-11-23 01:18:58.961469] 2022-11-23T01:18:58.9627755Z /usr/bin/mpiexec 2022-11-23T01:18:58.9628849Z MPI not available -- MPI backend tests will be skipped 2022-11-23T01:18:58.9629698Z Map different backends to different shards for distributed/algorithms/quantization/test_quantization: {'gloo': 1, 'nccl': 2} 2022-11-23T01:18:58.9630125Z Shard 3: test should be run in 1 2022-11-23T01:18:58.9630432Z Shard 3: nccl should be run in 2 2022-11-23T01:18:58.9630725Z Shard 3: gloo should be run in 1 2022-11-23T01:18:58.9631047Z Shard 3: ucc should be run in 1 2022-11-23T01:18:58.9632297Z Running distributed/test_distributed_spawn ... [2022-11-23 01:18:58.962947] 2022-11-23T01:18:58.9641461Z /usr/bin/mpiexec 2022-11-23T01:18:58.9642339Z MPI not available -- MPI backend tests will be skipped 2022-11-23T01:18:58.9643069Z Map different backends to different shards for distributed/test_distributed_spawn: {'gloo': 1, 'nccl': 2, 'ucc': 3} 2022-11-23T01:18:58.9643489Z Shard 3: test should be run in 1 2022-11-23T01:18:58.9643801Z Shard 3: nccl should be run in 2 2022-11-23T01:18:58.9644073Z Shard 3: gloo should be run in 1 2022-11-23T01:18:58.9647043Z Running distributed tests for the ucc backend with env init_method in shard 3 of 3 2022-11-23T01:18:58.9652263Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 01:18:58.964926] 2022-11-23T01:43:17.7203773Z 2022-11-23T01:43:17.7206209Z Expand the folded group to see the log file of distributed/test_distributed_spawn 2022-11-23T01:43:17.7212212Z ##[group]PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_y2_41v57) 2022-11-23T01:43:17.7212783Z 2022-11-23T01:43:17.7270677Z , <__main__.TestDistBackendWithSpawn testMethod=test_3_level_hierarchical_model_averager>, <__main__.TestDistBackendWithSpawn testMethod=test_Backend_enum_class>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_2D_Input>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Channels_Last>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_No_Affine>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_non_default_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_with_amp_and_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedSampler_padding>, <__main__.TestDistBackendWithSpawn 
testMethod=test_SyncBatchNorm_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_with_then_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_simple>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_with_empty>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_cat_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_stack_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_default_pg>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_max>, <__main__.TestDistBackendWithSpawn 
testMethod=test_all_reduce_coalesced_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max_complex_unsupported>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_complex_unsupported_ops>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu_complex>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_result_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_average_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_global>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_group>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo_tags>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_mixed_backend_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_no_rank_zero_nccl>, <__main__.TestDistBackendWithSpawn 
testMethod=test_batch_isend_irecv_op_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_op_list_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_ring_exchange_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_self_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_tensor_err>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_with_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_without_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer_via_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce_return_future>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_comm_hook_logging>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_different_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_same_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_create_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_device>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_forward_backward_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_grad_div_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_post_localSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_pickling_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_ignore_params_arg>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_inference>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_join_model_equivalence>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_gpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_num_params_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_shape_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_err_ignore_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_error>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_namedtuple>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_python_error_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_returns_tensor_with_no_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_shared_grad_acc_unused_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_static_graph_nested_types>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_bn_training_vs_eval>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_module_states>, 
<__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_join_disable>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs_stop_iteration_sync_bn>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_unused_params_rebuild_buckets_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_zero_output_features>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_group>, <__main__.TestDistBackendWithSpawn testMethod=test_detect_ddp_is_actually_static>, <__main__.TestDistBackendWithSpawn testMethod=test_different_graph_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_dump_DDP_relevant_env_vars>, <__main__.TestDistBackendWithSpawn testMethod=test_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_checks>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_get_backend>, <__main__.TestDistBackendWithSpawn testMethod=test_get_future>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_group>, <__main__.TestDistBackendWithSpawn testMethod=test_invalid_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_irecv>, <__main__.TestDistBackendWithSpawn testMethod=test_isend>, <__main__.TestDistBackendWithSpawn testMethod=test_isend_autograd_profiler>, 
<__main__.TestDistBackendWithSpawn testMethod=test_isend_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_failure_order>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_rank_0_timeout>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allgather>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_reduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_high_priority_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_input_rank_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_negative_input_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_group_size_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_overlap_not_allowed>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_world_size_not_divisible_by_group_size>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_dict_module>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_tuple_module>, <__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager>, 
<__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager_param_group>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_step_reload>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_checks>, 
<__main__.TestDistBackendWithSpawn testMethod=test_scatter_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_stateless_api_with_ddp>, <__main__.TestDistBackendWithSpawn testMethod=test_static_graph_api_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_sync_bn_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_undefined_grad_parity_unused_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_verify_model_across_rank_with_logger>, <__main__.TestDistBackendWithSpawn 
testMethod=test_verify_model_across_rank_without_logger>]> 2022-11-23T01:43:17.7343017Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7343999Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7344812Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7345557Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7346382Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7347268Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7348222Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7349171Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7350160Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7351182Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7352268Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7353478Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7354544Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7356136Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7357085Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7357949Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7358810Z test_DistributedSampler_padding 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7359660Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7360429Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7361307Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7362241Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7363146Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7363900Z test_all_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7364638Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7365607Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7366391Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7367192Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7367987Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7368724Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7369451Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7370139Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7370873Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7371586Z test_all_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7372319Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7373133Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7373875Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7374641Z test_all_gather_multigpu_complex 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7375473Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7376270Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7377001Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7377793Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7378811Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7379391Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7379864Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7380316Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7380767Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7381206Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7381658Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7382091Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7382541Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7383004Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7383543Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7383972Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7384397Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7384842Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7385269Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7385686Z 
test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7386117Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7388612Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7389015Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7389431Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7389849Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7390252Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7390624Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7391036Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7391462Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7391983Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7392389Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7392792Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7393194Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7393586Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7393996Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7394410Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7394838Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7395845Z test_all_to_all (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7396243Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7396639Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7397033Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) 
2022-11-23T01:43:17.7397452Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7397872Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7398268Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7398671Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7399094Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7399545Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7399992Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7400462Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7400940Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7401406Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7401878Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7402346Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7402800Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7403242Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7403712Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7404186Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7404768Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7405257Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7405729Z test_all_to_all_single_unequal_split_group 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7406208Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7406638Z test_average_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7407048Z test_backend_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7407445Z test_backend_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7407809Z test_barrier (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7408189Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7408591Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7408992Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7409397Z test_barrier_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7409797Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7410199Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7410697Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7411128Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7411542Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7411950Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7412397Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7412833Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7413250Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7413690Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7414128Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7414581Z test_batch_isend_irecv_ring_exchange_nccl 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7415007Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7415445Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7415851Z test_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7416222Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7416628Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7417034Z test_broadcast_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7417423Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7417834Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7418320Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7418862Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7419323Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7419759Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7420194Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7420631Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7421142Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7421631Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7422094Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7422581Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7423049Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7423479Z 
test_ddp_create_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7423854Z test_ddp_device (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7424265Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7424701Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7425120Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7425578Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7426042Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7426477Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7426894Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7427385Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7427909Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7428542Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7429208Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7429843Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7430481Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7431121Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7431756Z 
test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7432392Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7433027Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7433603Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7434106Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7434571Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7434969Z test_ddp_inference (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7469188Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7469666Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7470111Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7470573Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7471071Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7471596Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7472111Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7472575Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7473008Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7473448Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7474095Z test_ddp_profiling_autograd_profiler 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7474580Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7475553Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7476100Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7476565Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7477022Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7477449Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7477874Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7478307Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7478749Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7479158Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7479600Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7480084Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7480526Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7481037Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7481456Z test_destroy_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7481857Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7482300Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7482740Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7483141Z test_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7483503Z test_gather_checks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7483895Z 
test_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7484291Z test_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7484669Z test_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7485056Z test_gather_object (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7485462Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7485849Z test_get_backend (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7486231Z test_get_future (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7486604Z test_get_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7486999Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7487398Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7487808Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7488190Z test_irecv (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7488543Z test_isend (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7488938Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7489360Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7489775Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7490250Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7490725Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7491155Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7491580Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7492037Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7492485Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) 
2022-11-23T01:43:17.7492998Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7493434Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7493867Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7494274Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7494705Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7495117Z test_new_subgroups (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7495531Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7496001Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7496517Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7497010Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7497467Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7497949Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7498425Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7498877Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7499354Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7499815Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7500275Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7500730Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7501233Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7501773Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7502287Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7502711Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7503124Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7503550Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7503958Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7504363Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7504758Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7505164Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7505554Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7505945Z test_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7506320Z test_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7506698Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7507094Z test_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7507506Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7507918Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7508315Z test_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7508703Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7509089Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7509490Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7509871Z test_scatter (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7510289Z test_scatter_checks 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7510668Z test_scatter_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7511124Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7511524Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7511920Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7512314Z test_scatter_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7512711Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7513083Z test_send_recv (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7513476Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7513918Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7514382Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7514816Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7515741Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7516166Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7516604Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7517036Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7517449Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7517976Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7518428Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7518862Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7519286Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7519689Z test_stateless_api_with_ddp 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7520083Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7520466Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7520930Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7521404Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7521853Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7522607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7523067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7523667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7524159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7524401Z 2022-11-23T01:43:17.7524496Z Running tests... 2022-11-23T01:43:17.7524914Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7525463Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7526084Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7526655Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 911 2022-11-23T01:43:17.7527119Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 912 2022-11-23T01:43:17.7527753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7528223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7528802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7529294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7529889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7530446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7531039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7531526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7532001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7532508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7533193Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7533912Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7534466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7534943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7535491Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7536410Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:43:17.7537104Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7537958Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:43:17.7538648Z [1669166348.369997] [d8f8c46cdf70:911 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7539232Z [1669166348.378596] [d8f8c46cdf70:912 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.7539766Z [1669166348.376127] [d8f8c46cdf70:911 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7540253Z [1669166348.376127] [d8f8c46cdf70:911 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7540751Z [1669166348.384456] [d8f8c46cdf70:912 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7541243Z [1669166348.384456] [d8f8c46cdf70:912 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7541781Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7542642Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:43:17.7543334Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7544183Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:43:17.7544881Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7545719Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:43:17.7546397Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7547303Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 
2022-11-23T01:43:17.7547993Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7548833Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:43:17.7549519Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:43:17.7550359Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:43:17.7550840Z ok (5.992s) 2022-11-23T01:43:17.7550991Z 2022-11-23T01:43:17.7551246Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7551575Z Ran 1 test in 5.992s 2022-11-23T01:43:17.7551740Z 2022-11-23T01:43:17.7551833Z OK 2022-11-23T01:43:17.7551970Z 2022-11-23T01:43:17.7552079Z Generating XML reports... 2022-11-23T01:43:17.7552700Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011903.xml 2022-11-23T01:43:17.7553498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7553978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7554562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7555427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7555753Z 2022-11-23T01:43:17.7555867Z Running tests... 
2022-11-23T01:43:17.7556275Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7556824Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7557379Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.004s) 2022-11-23T01:43:17.7557699Z 2022-11-23T01:43:17.7557968Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7558284Z Ran 1 test in 0.004s 2022-11-23T01:43:17.7558450Z 2022-11-23T01:43:17.7558559Z OK (skipped=1) 2022-11-23T01:43:17.7558718Z 2022-11-23T01:43:17.7558844Z Generating XML reports... 2022-11-23T01:43:17.7559467Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011911.xml 2022-11-23T01:43:17.7560191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7560663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7561268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7561744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7561983Z 2022-11-23T01:43:17.7562093Z Running tests... 2022-11-23T01:43:17.7562511Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7563051Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7563563Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7564072Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1059 2022-11-23T01:43:17.7564540Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1060 2022-11-23T01:43:17.7565155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7565741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7566342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7566831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7567420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7567887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7568478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7568964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7569416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7569933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7570619Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7571323Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7571943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7572448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7572804Z ok (4.268s) 2022-11-23T01:43:17.7572942Z 2022-11-23T01:43:17.7573222Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7573557Z Ran 1 test in 4.268s 2022-11-23T01:43:17.7573724Z 2022-11-23T01:43:17.7573817Z OK 2022-11-23T01:43:17.7573957Z 2022-11-23T01:43:17.7574067Z Generating XML reports... 2022-11-23T01:43:17.7574697Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011914.xml 2022-11-23T01:43:17.7575436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7575905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7576490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7576984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7577216Z 2022-11-23T01:43:17.7577327Z Running tests... 2022-11-23T01:43:17.7577724Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7578262Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7578813Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7579913Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77317 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.599s) 2022-11-23T01:43:17.7580463Z 2022-11-23T01:43:17.7580733Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7581052Z Ran 1 test in 1.599s 2022-11-23T01:43:17.7581217Z 2022-11-23T01:43:17.7581326Z OK (skipped=1) 2022-11-23T01:43:17.7581485Z 2022-11-23T01:43:17.7581610Z Generating XML reports... 2022-11-23T01:43:17.7582214Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011921.xml 2022-11-23T01:43:17.7582950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7583488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7584090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7584567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7584808Z 2022-11-23T01:43:17.7584917Z Running tests... 2022-11-23T01:43:17.7585333Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7585874Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7586418Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7586952Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1196 2022-11-23T01:43:17.7587421Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1197 2022-11-23T01:43:17.7588044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7588512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7589111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7589652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7590249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7590720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7591312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7591781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7592254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7592781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7593464Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7594175Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7594726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7595698Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptx4v50ok 2022-11-23T01:43:17.7596270Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptx4v50ok/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7596783Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7597306Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn__ql8rj 2022-11-23T01:43:17.7597868Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn__ql8rj/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7598387Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7598896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7599451Z [1669166369.202948] [d8f8c46cdf70:1196 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7599989Z [1669166369.993587] [d8f8c46cdf70:1196 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7600473Z [1669166369.993587] [d8f8c46cdf70:1196 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7601018Z [1669166369.205176] [d8f8c46cdf70:1197 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.7601650Z [1669166370.012102] [d8f8c46cdf70:1197 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7602143Z [1669166370.012102] [d8f8c46cdf70:1197 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7602485Z ok (5.584s) 2022-11-23T01:43:17.7602640Z 2022-11-23T01:43:17.7602926Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7603260Z Ran 1 test in 5.584s 2022-11-23T01:43:17.7603426Z 2022-11-23T01:43:17.7603520Z OK 2022-11-23T01:43:17.7603639Z 2022-11-23T01:43:17.7603768Z Generating XML reports... 2022-11-23T01:43:17.7604393Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011925.xml 2022-11-23T01:43:17.7605136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7605597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7606195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7606685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7606990Z 2022-11-23T01:43:17.7607110Z Running tests... 2022-11-23T01:43:17.7607507Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7608050Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7608627Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7609164Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1310 2022-11-23T01:43:17.7609633Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1311 2022-11-23T01:43:17.7610266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7610733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7611315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7611811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7612411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7612872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7613449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7613933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7614411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7614916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7615596Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7616316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7616863Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7617370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdk82kd9y 2022-11-23T01:43:17.7617931Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdk82kd9y/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7618460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7619049Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbt972sq9 2022-11-23T01:43:17.7619591Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbt972sq9/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7620126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7620637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7621226Z [1669166377.304237] [d8f8c46cdf70:1311 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7621765Z [1669166378.084598] [d8f8c46cdf70:1311 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7622266Z [1669166378.084598] [d8f8c46cdf70:1311 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7622811Z [1669166377.282974] [d8f8c46cdf70:1310 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.7623321Z [1669166378.100268] [d8f8c46cdf70:1310 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7623872Z [1669166378.100268] [d8f8c46cdf70:1310 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7624238Z ok (5.557s) 2022-11-23T01:43:17.7624390Z 2022-11-23T01:43:17.7624670Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7624993Z Ran 1 test in 5.557s 2022-11-23T01:43:17.7625160Z 2022-11-23T01:43:17.7625250Z OK 2022-11-23T01:43:17.7625385Z 2022-11-23T01:43:17.7625511Z Generating XML reports... 2022-11-23T01:43:17.7626122Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011933.xml 2022-11-23T01:43:17.7626869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7627336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7627936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7628414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7628655Z 2022-11-23T01:43:17.7628768Z Running tests... 2022-11-23T01:43:17.7629175Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7629717Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7630274Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7630829Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1424 2022-11-23T01:43:17.7631296Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1425 2022-11-23T01:43:17.7631910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7632385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7632981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7633473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7634059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7634523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7635609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7636224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7636695Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7637215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7637918Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7638623Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7639169Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7639660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7640227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpob2zo328 2022-11-23T01:43:17.7640782Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpob2zo328/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7641339Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe5_d74xt 2022-11-23T01:43:17.7641892Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe5_d74xt/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7642538Z [1669166386.158120] [d8f8c46cdf70:1425 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7643091Z [1669166386.164957] [d8f8c46cdf70:1425 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7643595Z [1669166386.164957] [d8f8c46cdf70:1425 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7644133Z [1669166386.156025] [d8f8c46cdf70:1424 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7644673Z [1669166386.163159] [d8f8c46cdf70:1424 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7645151Z [1669166386.163159] [d8f8c46cdf70:1424 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7645655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7646154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.7646642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7647141Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7647640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7648130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7648611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7649106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7649597Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7650075Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7650573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7651063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7651418Z ok (6.443s) 2022-11-23T01:43:17.7651554Z 2022-11-23T01:43:17.7651837Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7652179Z Ran 1 test in 6.443s 2022-11-23T01:43:17.7652420Z 2022-11-23T01:43:17.7652514Z OK 2022-11-23T01:43:17.7652650Z 2022-11-23T01:43:17.7652760Z Generating XML reports... 
2022-11-23T01:43:17.7653389Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011941.xml 2022-11-23T01:43:17.7654135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7654610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7655190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7655681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7655919Z 2022-11-23T01:43:17.7656030Z Running tests... 2022-11-23T01:43:17.7656427Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7656970Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7657556Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7658115Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1542 2022-11-23T01:43:17.7658565Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1543 2022-11-23T01:43:17.7659249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7659727Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7660329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7660802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7661403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7661867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7662444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7662928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7663402Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7663925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7664593Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7665315Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7665862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7666363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7666868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9sm1_1ok 2022-11-23T01:43:17.7667425Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9sm1_1ok/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7667977Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk4zm3zd7 2022-11-23T01:43:17.7668518Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk4zm3zd7/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7669097Z [1669166395.170897] [d8f8c46cdf70:1543 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7669635Z [1669166395.176351] [d8f8c46cdf70:1543 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7670133Z [1669166395.176351] [d8f8c46cdf70:1543 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7670729Z [1669166395.166134] [d8f8c46cdf70:1542 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7671263Z [1669166395.171320] [d8f8c46cdf70:1542 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7671756Z [1669166395.171320] [d8f8c46cdf70:1542 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7672255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7672739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.7673244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7673742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7674107Z ok (5.556s) 2022-11-23T01:43:17.7674242Z 2022-11-23T01:43:17.7674525Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7674862Z Ran 1 test in 5.556s 2022-11-23T01:43:17.7675259Z 2022-11-23T01:43:17.7675365Z OK 2022-11-23T01:43:17.7675507Z 2022-11-23T01:43:17.7675693Z Generating XML reports... 2022-11-23T01:43:17.7676341Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011950.xml 2022-11-23T01:43:17.7677081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7677552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7678133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7678634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7678876Z 2022-11-23T01:43:17.7678988Z Running tests... 2022-11-23T01:43:17.7679384Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7679925Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7680526Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7681093Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1660 2022-11-23T01:43:17.7681543Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1661 2022-11-23T01:43:17.7682171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7682642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7683222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7683718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7684322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7684792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7685370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7685858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7686331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7686856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7687526Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7688333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7688881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7689363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7689883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr_ywx379 2022-11-23T01:43:17.7690441Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr_ywx379/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7690998Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpntmpks1o 2022-11-23T01:43:17.7691541Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpntmpks1o/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7692126Z [1669166403.213051] [d8f8c46cdf70:1660 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7692669Z [1669166403.218911] [d8f8c46cdf70:1660 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7693231Z [1669166403.218911] [d8f8c46cdf70:1660 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7693774Z [1669166403.219052] [d8f8c46cdf70:1661 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7694307Z [1669166403.225198] [d8f8c46cdf70:1661 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7694802Z [1669166403.225198] [d8f8c46cdf70:1661 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7695300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7695797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.7696303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7696805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7697305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7697789Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7698287Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7698779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7699127Z ok (5.720s) 2022-11-23T01:43:17.7699280Z 2022-11-23T01:43:17.7699561Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7699902Z Ran 1 test in 5.720s 2022-11-23T01:43:17.7700069Z 2022-11-23T01:43:17.7700163Z OK 2022-11-23T01:43:17.7700283Z 2022-11-23T01:43:17.7700410Z Generating XML reports... 2022-11-23T01:43:17.7701040Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011958.xml 2022-11-23T01:43:17.7701779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7702234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7702830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7703321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7703560Z 2022-11-23T01:43:17.7703670Z Running tests... 
2022-11-23T01:43:17.7704064Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7704672Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7705289Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7705861Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1778 2022-11-23T01:43:17.7706331Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1779 2022-11-23T01:43:17.7706961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7707429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7708007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7708495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7709103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7709549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7710136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7710675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7711161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7711667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7712353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7713076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7713631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7714111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7714630Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgumlm0yg 2022-11-23T01:43:17.7715419Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgumlm0yg/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7715969Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjsp57bht 2022-11-23T01:43:17.7716531Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjsp57bht/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7717118Z [1669166411.420494] [d8f8c46cdf70:1778 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7717656Z [1669166411.425909] [d8f8c46cdf70:1778 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7718146Z [1669166411.425909] [d8f8c46cdf70:1778 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7718695Z [1669166411.420968] [d8f8c46cdf70:1779 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7719222Z [1669166411.426393] [d8f8c46cdf70:1779 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7719718Z [1669166411.426393] [d8f8c46cdf70:1779 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7720201Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.7720707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7721202Z ok (5.602s) 2022-11-23T01:43:17.7721354Z 2022-11-23T01:43:17.7721637Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7721955Z Ran 1 test in 5.603s 2022-11-23T01:43:17.7722126Z 2022-11-23T01:43:17.7722221Z OK 2022-11-23T01:43:17.7722358Z 2022-11-23T01:43:17.7722484Z Generating XML reports... 2022-11-23T01:43:17.7723102Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012006.xml 2022-11-23T01:43:17.7723849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7724318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7724918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7725397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7725644Z 2022-11-23T01:43:17.7725756Z Running tests... 2022-11-23T01:43:17.7726168Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7726696Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7727373Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7727969Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1896 2022-11-23T01:43:17.7728436Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1897 2022-11-23T01:43:17.7729058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7729527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7730124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7730621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7731207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7731673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7732268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7732739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7733212Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7733734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7734416Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7735126Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7735678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7736174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7736695Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7bddzd48 2022-11-23T01:43:17.7737241Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7bddzd48/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7737798Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe4x2mr0t 2022-11-23T01:43:17.7738360Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe4x2mr0t/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7738923Z [1669166419.658129] [d8f8c46cdf70:1896 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7739529Z [1669166419.664768] [d8f8c46cdf70:1896 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7740025Z [1669166419.664768] [d8f8c46cdf70:1896 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7740575Z [1669166419.666503] [d8f8c46cdf70:1897 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7741103Z [1669166419.673915] [d8f8c46cdf70:1897 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7741582Z [1669166419.673915] [d8f8c46cdf70:1897 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7742086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7742598Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.7743086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7743587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7743947Z ok (6.259s) 2022-11-23T01:43:17.7744101Z 2022-11-23T01:43:17.7744431Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7744760Z Ran 1 test in 6.260s 2022-11-23T01:43:17.7744924Z 2022-11-23T01:43:17.7745018Z OK 2022-11-23T01:43:17.7745157Z 2022-11-23T01:43:17.7745284Z Generating XML reports... 2022-11-23T01:43:17.7745895Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012014.xml 2022-11-23T01:43:17.7746635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7747114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7747714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7748190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7748434Z 2022-11-23T01:43:17.7748544Z Running tests... 2022-11-23T01:43:17.7748956Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7749477Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7750061Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7750621Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2014 2022-11-23T01:43:17.7751091Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2015 2022-11-23T01:43:17.7751718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7752194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7752792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7753272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7753876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7754341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7754932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7755628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7756106Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7756713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7757405Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7758110Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7758661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7759153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7759655Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpilpvg68q 2022-11-23T01:43:17.7760215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpilpvg68q/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7760775Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5kprfhg9 2022-11-23T01:43:17.7761340Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5kprfhg9/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7761908Z [1669166428.399851] [d8f8c46cdf70:2015 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7762525Z [1669166428.407063] [d8f8c46cdf70:2015 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7763036Z [1669166428.407063] [d8f8c46cdf70:2015 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7763577Z [1669166428.393190] [d8f8c46cdf70:2014 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7764089Z [1669166428.400017] [d8f8c46cdf70:2014 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7764593Z [1669166428.400017] [d8f8c46cdf70:2014 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7765095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7765599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.7766084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7766581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7766942Z ok (5.908s) 2022-11-23T01:43:17.7767095Z 2022-11-23T01:43:17.7767375Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7767691Z Ran 1 test in 5.908s 2022-11-23T01:43:17.7767857Z 2022-11-23T01:43:17.7767949Z OK 2022-11-23T01:43:17.7768087Z 2022-11-23T01:43:17.7768217Z Generating XML reports... 2022-11-23T01:43:17.7768824Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012023.xml 2022-11-23T01:43:17.7769570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7770039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7770642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7771119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7771361Z 2022-11-23T01:43:17.7771472Z Running tests... 2022-11-23T01:43:17.7771884Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7772414Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7773030Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7773677Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2132 2022-11-23T01:43:17.7774144Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2133 2022-11-23T01:43:17.7774767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7775239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7775835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7776313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7776919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7777386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7777986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7778454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7778980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7779550Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7780253Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7780955Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7781504Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7781999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7782510Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz1ikugu3 2022-11-23T01:43:17.7783077Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz1ikugu3/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7783635Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp25_onl16 2022-11-23T01:43:17.7784193Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp25_onl16/_remote_module_non_scriptable.py 2022-11-23T01:43:17.7784755Z [1669166436.913734] [d8f8c46cdf70:2133 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7785294Z [1669166436.920962] [d8f8c46cdf70:2133 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7785795Z [1669166436.920962] [d8f8c46cdf70:2133 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7786348Z [1669166436.906788] [d8f8c46cdf70:2132 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7786863Z [1669166436.913338] [d8f8c46cdf70:2132 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7787367Z [1669166436.913338] [d8f8c46cdf70:2132 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7787867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7788373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.7788861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7789363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.7789790Z ok (5.555s) 2022-11-23T01:43:17.7789943Z 2022-11-23T01:43:17.7790204Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7790537Z Ran 1 test in 5.556s 2022-11-23T01:43:17.7790705Z 2022-11-23T01:43:17.7790799Z OK 2022-11-23T01:43:17.7790934Z 2022-11-23T01:43:17.7791061Z Generating XML reports... 2022-11-23T01:43:17.7791673Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012032.xml 2022-11-23T01:43:17.7792419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7792889Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7793468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7793961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7794207Z 2022-11-23T01:43:17.7794317Z Running tests... 2022-11-23T01:43:17.7794727Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7795469Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7796136Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7797271Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/76428 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.603s) 2022-11-23T01:43:17.7797822Z 2022-11-23T01:43:17.7798090Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7798404Z Ran 1 test in 1.603s 2022-11-23T01:43:17.7798570Z 2022-11-23T01:43:17.7798686Z OK (skipped=1) 2022-11-23T01:43:17.7798846Z 2022-11-23T01:43:17.7798970Z Generating XML reports... 2022-11-23T01:43:17.7799595Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012040.xml 2022-11-23T01:43:17.7800318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7800797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7801396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7801873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7802111Z 2022-11-23T01:43:17.7802220Z Running tests... 2022-11-23T01:43:17.7802627Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7803168Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7803727Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7804275Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2284 2022-11-23T01:43:17.7804744Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2285 2022-11-23T01:43:17.7805362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7805828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7806425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7806914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7807504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7808066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7808661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7809145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7809605Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7810131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7810815Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7811525Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7812075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7812573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7812931Z ok (4.361s) 2022-11-23T01:43:17.7813069Z 2022-11-23T01:43:17.7813341Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7813678Z Ran 1 test in 4.361s 2022-11-23T01:43:17.7813844Z 2022-11-23T01:43:17.7813938Z OK 2022-11-23T01:43:17.7814075Z 2022-11-23T01:43:17.7814238Z Generating XML reports... 2022-11-23T01:43:17.7814873Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012044.xml 2022-11-23T01:43:17.7815610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7816080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7816664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7817158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7817397Z 2022-11-23T01:43:17.7817507Z Running tests... 2022-11-23T01:43:17.7817904Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7818445Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7819037Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7820155Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77294 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.609s) 2022-11-23T01:43:17.7820702Z 2022-11-23T01:43:17.7821009Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7821337Z Ran 1 test in 1.609s 2022-11-23T01:43:17.7821505Z 2022-11-23T01:43:17.7821613Z OK (skipped=1) 2022-11-23T01:43:17.7821771Z 2022-11-23T01:43:17.7821896Z Generating XML reports... 2022-11-23T01:43:17.7822505Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012051.xml 2022-11-23T01:43:17.7823247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7823722Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7824316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7824791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7825030Z 2022-11-23T01:43:17.7825140Z Running tests... 2022-11-23T01:43:17.7825550Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7826150Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7826698Z test_DistributedSampler_padding (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7827224Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2421 2022-11-23T01:43:17.7827695Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2422 2022-11-23T01:43:17.7828310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7828784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7829378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7829846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7830446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7830914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7831504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7832023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7832509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7833030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7833718Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7834422Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7834975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7835695Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7836243Z [1669166460.120626] [d8f8c46cdf70:2421 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7836768Z [1669166460.126655] [d8f8c46cdf70:2421 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7837269Z [1669166460.126655] [d8f8c46cdf70:2421 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7837811Z [1669166460.128791] [d8f8c46cdf70:2422 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7838343Z [1669166460.135922] [d8f8c46cdf70:2422 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7838828Z [1669166460.135922] [d8f8c46cdf70:2422 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7839191Z ok (5.563s) 2022-11-23T01:43:17.7839343Z 2022-11-23T01:43:17.7839622Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7839938Z Ran 1 test in 5.563s 2022-11-23T01:43:17.7840108Z 2022-11-23T01:43:17.7840202Z OK 2022-11-23T01:43:17.7840339Z 2022-11-23T01:43:17.7840466Z Generating XML reports... 
2022-11-23T01:43:17.7841096Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012055.xml 2022-11-23T01:43:17.7841823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7842292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7842892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7843460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7843702Z 2022-11-23T01:43:17.7843813Z Running tests... 2022-11-23T01:43:17.7844226Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7844775Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7845285Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) ... skip: no torchvision (0.002s) 2022-11-23T01:43:17.7845585Z 2022-11-23T01:43:17.7845852Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7846183Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7846348Z 2022-11-23T01:43:17.7846442Z OK (skipped=1) 2022-11-23T01:43:17.7846601Z 2022-11-23T01:43:17.7846727Z Generating XML reports... 
2022-11-23T01:43:17.7847351Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012103.xml 2022-11-23T01:43:17.7848100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7848557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7849219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7849721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7849959Z 2022-11-23T01:43:17.7850070Z Running tests... 2022-11-23T01:43:17.7850468Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7851013Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7851484Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7851992Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:43:17.7852316Z 2022-11-23T01:43:17.7852585Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7852916Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7853081Z 2022-11-23T01:43:17.7853190Z OK (skipped=1) 2022-11-23T01:43:17.7853332Z 2022-11-23T01:43:17.7853460Z Generating XML reports... 
2022-11-23T01:43:17.7854083Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012105.xml 2022-11-23T01:43:17.7854825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7855279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7855874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7856368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7856609Z 2022-11-23T01:43:17.7856717Z Running tests... 2022-11-23T01:43:17.7857109Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7857651Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7858153Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7858691Z Runs multiple iterations on _test_accumulate_gradients_no_sync ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:43:17.7859016Z 2022-11-23T01:43:17.7859284Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7859619Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7859785Z 2022-11-23T01:43:17.7859895Z OK (skipped=1) 2022-11-23T01:43:17.7860054Z 2022-11-23T01:43:17.7860162Z Generating XML reports... 
2022-11-23T01:43:17.7860781Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012108.xml 2022-11-23T01:43:17.7861586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7862054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7862642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7863129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7863368Z 2022-11-23T01:43:17.7863478Z Running tests... 2022-11-23T01:43:17.7863872Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7864412Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7864928Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7865517Z Runs multiple iterations on _test_accumulate_gradients_no_sync using allreduce ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:43:17.7865844Z 2022-11-23T01:43:17.7866115Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7866451Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7866618Z 2022-11-23T01:43:17.7866782Z OK (skipped=1) 2022-11-23T01:43:17.7866950Z 2022-11-23T01:43:17.7867059Z Generating XML reports... 
2022-11-23T01:43:17.7867684Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012110.xml 2022-11-23T01:43:17.7868425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7868896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7869482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7869981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7870222Z 2022-11-23T01:43:17.7870332Z Running tests... 2022-11-23T01:43:17.7870729Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7871273Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7871771Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:43:17.7872312Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:43:17.7872631Z 2022-11-23T01:43:17.7872884Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7873218Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7873385Z 2022-11-23T01:43:17.7873494Z OK (skipped=1) 2022-11-23T01:43:17.7873654Z 2022-11-23T01:43:17.7873784Z Generating XML reports... 
2022-11-23T01:43:17.7874392Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012113.xml 2022-11-23T01:43:17.7875347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7875828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7876419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7876913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7877153Z 2022-11-23T01:43:17.7877262Z Running tests... 2022-11-23T01:43:17.7877678Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7878200Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7878718Z test_all_gather (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7879303Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2700 2022-11-23T01:43:17.7879757Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2701 2022-11-23T01:43:17.7880392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7880870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7881469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7881945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7882552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7883015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7883601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7884087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7884560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7885161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7885845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7886563Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7887112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7887711Z STAGE:2022-11-23 01:21:19 2701:2701 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7888192Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7888785Z STAGE:2022-11-23 01:21:19 2700:2700 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7889333Z [1669166479.599001] [d8f8c46cdf70:2701 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7889870Z [1669166480.634082] [d8f8c46cdf70:2701 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7890356Z [1669166480.634082] [d8f8c46cdf70:2701 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7890900Z [1669166479.596378] [d8f8c46cdf70:2700 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.7891426Z [1669166480.649754] [d8f8c46cdf70:2700 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7891926Z [1669166480.649754] [d8f8c46cdf70:2700 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7892740Z STAGE:2022-11-23 01:21:21 2701:2701 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:21:21 2700:2700 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7893147Z 2022-11-23T01:43:17.7893507Z STAGE:2022-11-23 01:21:21 2700:2700 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7894127Z STAGE:2022-11-23 01:21:21 2701:2701 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7894728Z STAGE:2022-11-23 01:21:21 2701:2701 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7895297Z STAGE:2022-11-23 01:21:21 2700:2700 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7895951Z STAGE:2022-11-23 01:21:21 2701:2701 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7896562Z STAGE:2022-11-23 01:21:21 2701:2701 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7897167Z STAGE:2022-11-23 01:21:21 2700:2700 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7897760Z STAGE:2022-11-23 01:21:21 2700:2700 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7898122Z ok (6.061s) 2022-11-23T01:43:17.7898277Z 2022-11-23T01:43:17.7898544Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7898863Z Ran 1 test in 6.061s 2022-11-23T01:43:17.7899029Z 2022-11-23T01:43:17.7899123Z OK 2022-11-23T01:43:17.7899261Z 2022-11-23T01:43:17.7899387Z Generating XML reports... 
2022-11-23T01:43:17.7899992Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012115.xml 2022-11-23T01:43:17.7900740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7901214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7901810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7902336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7902587Z 2022-11-23T01:43:17.7902697Z Running tests... 2022-11-23T01:43:17.7903107Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7903647Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7904195Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:43:17.7904526Z 2022-11-23T01:43:17.7904799Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7905132Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7905299Z 2022-11-23T01:43:17.7905391Z OK (skipped=1) 2022-11-23T01:43:17.7905549Z 2022-11-23T01:43:17.7905673Z Generating XML reports... 
2022-11-23T01:43:17.7906297Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012124.xml 2022-11-23T01:43:17.7907039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7907488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7908088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7908576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7908817Z 2022-11-23T01:43:17.7908927Z Running tests... 2022-11-23T01:43:17.7909324Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7909868Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7910178Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:43:17.7910199Z 2022-11-23T01:43:17.7910464Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7910579Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7910599Z 2022-11-23T01:43:17.7910691Z OK (skipped=1) 2022-11-23T01:43:17.7910726Z 2022-11-23T01:43:17.7910835Z Generating XML reports... 
2022-11-23T01:43:17.7911297Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012126.xml 2022-11-23T01:43:17.7911686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7911933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7912333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7912533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7912553Z 2022-11-23T01:43:17.7912663Z Running tests... 2022-11-23T01:43:17.7912933Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7913235Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7913535Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:43:17.7913555Z 2022-11-23T01:43:17.7913816Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7913929Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7913949Z 2022-11-23T01:43:17.7914063Z OK (skipped=1) 2022-11-23T01:43:17.7914082Z 2022-11-23T01:43:17.7914207Z Generating XML reports... 
2022-11-23T01:43:17.7914665Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012129.xml 2022-11-23T01:43:17.7915251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7915516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7915915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7916114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7916135Z 2022-11-23T01:43:17.7916245Z Running tests... 2022-11-23T01:43:17.7916508Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7916826Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7917135Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:43:17.7917156Z 2022-11-23T01:43:17.7917420Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7917534Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7917554Z 2022-11-23T01:43:17.7917667Z OK (skipped=1) 2022-11-23T01:43:17.7917688Z 2022-11-23T01:43:17.7917798Z Generating XML reports... 
2022-11-23T01:43:17.7918253Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012131.xml 2022-11-23T01:43:17.7918636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7918818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7919211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7919414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7919434Z 2022-11-23T01:43:17.7919544Z Running tests... 2022-11-23T01:43:17.7919809Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7920115Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7920423Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.003s) 2022-11-23T01:43:17.7920443Z 2022-11-23T01:43:17.7920711Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7920829Z Ran 1 test in 0.003s 2022-11-23T01:43:17.7920848Z 2022-11-23T01:43:17.7920992Z OK (skipped=1) 2022-11-23T01:43:17.7921013Z 2022-11-23T01:43:17.7921139Z Generating XML reports... 
2022-11-23T01:43:17.7921600Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012133.xml 2022-11-23T01:43:17.7922069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7922250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7922633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7922831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7922851Z 2022-11-23T01:43:17.7922960Z Running tests... 2022-11-23T01:43:17.7923225Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7923546Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7923816Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7924047Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2979 2022-11-23T01:43:17.7924273Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2980 2022-11-23T01:43:17.7924659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7924826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7925272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7925482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7925863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7926041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7926431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7926632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7926889Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7927127Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7927546Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7927963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7928208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7928552Z STAGE:2022-11-23 01:21:40 2979:2979 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7928789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7929127Z STAGE:2022-11-23 01:21:40 2980:2980 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7929420Z [1669166500.146746] [d8f8c46cdf70:2979 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7929668Z [1669166501.182399] [d8f8c46cdf70:2979 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7929905Z [1669166501.182399] [d8f8c46cdf70:2979 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7930195Z [1669166500.167060] [d8f8c46cdf70:2980 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.7930436Z [1669166501.187226] [d8f8c46cdf70:2980 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7930748Z [1669166501.187226] [d8f8c46cdf70:2980 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7931314Z STAGE:2022-11-23 01:21:41 2979:2979 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:21:41 2980:2980 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7931336Z 2022-11-23T01:43:17.7931701Z STAGE:2022-11-23 01:21:41 2980:2980 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7932058Z STAGE:2022-11-23 01:21:41 2979:2979 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7932392Z STAGE:2022-11-23 01:21:41 2979:2979 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7932725Z STAGE:2022-11-23 01:21:41 2980:2980 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7933065Z STAGE:2022-11-23 01:21:41 2979:2979 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7933405Z STAGE:2022-11-23 01:21:41 2980:2980 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7933741Z STAGE:2022-11-23 01:21:41 2979:2979 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7934136Z STAGE:2022-11-23 01:21:41 2980:2980 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7934251Z ok (5.771s) 2022-11-23T01:43:17.7934271Z 2022-11-23T01:43:17.7934540Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7934654Z Ran 1 test in 5.771s 2022-11-23T01:43:17.7934674Z 2022-11-23T01:43:17.7934767Z OK 2022-11-23T01:43:17.7934786Z 2022-11-23T01:43:17.7934913Z Generating XML reports... 
2022-11-23T01:43:17.7935379Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012136.xml 2022-11-23T01:43:17.7935754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7935944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7936340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7936544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7936564Z 2022-11-23T01:43:17.7936677Z Running tests... 2022-11-23T01:43:17.7936940Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7937261Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7937539Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T01:43:17.7937559Z 2022-11-23T01:43:17.7937818Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7937920Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7937940Z 2022-11-23T01:43:17.7938050Z OK (skipped=1) 2022-11-23T01:43:17.7938069Z 2022-11-23T01:43:17.7938194Z Generating XML reports... 
2022-11-23T01:43:17.7938656Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012144.xml 2022-11-23T01:43:17.7939046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7939228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7939623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7939825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7939845Z 2022-11-23T01:43:17.7939955Z Running tests... 2022-11-23T01:43:17.7940248Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7940640Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7940929Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T01:43:17.7940950Z 2022-11-23T01:43:17.7941212Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7941326Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7941351Z 2022-11-23T01:43:17.7941460Z OK (skipped=1) 2022-11-23T01:43:17.7941480Z 2022-11-23T01:43:17.7941605Z Generating XML reports... 
2022-11-23T01:43:17.7942065Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012146.xml 2022-11-23T01:43:17.7942435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7942618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7943010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7943213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7943233Z 2022-11-23T01:43:17.7943342Z Running tests... 2022-11-23T01:43:17.7943605Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7943976Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7944257Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7944485Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3159 2022-11-23T01:43:17.7944694Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3160 2022-11-23T01:43:17.7945080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7945264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7945658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7945858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7946235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7946417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7946808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7946990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7947248Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7947502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7947926Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7948341Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7948587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7948838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.7949071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7949317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.7949715Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.7950121Z STAGE:2022-11-23 01:21:53 3159:3159 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7950527Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.7950862Z STAGE:2022-11-23 01:21:53 3160:3160 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7951160Z [1669166513.346680] [d8f8c46cdf70:3159 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7951406Z [1669166514.398335] [d8f8c46cdf70:3159 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7951662Z [1669166514.398335] [d8f8c46cdf70:3159 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7951949Z [1669166513.368696] [d8f8c46cdf70:3160 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.7952194Z [1669166514.398319] [d8f8c46cdf70:3160 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7952446Z [1669166514.398319] [d8f8c46cdf70:3160 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7953066Z STAGE:2022-11-23 01:21:54 3159:3159 ActivityProfilerController.cpp:306] Completed Stage: Collection STAGE:2022-11-23 01:21:54 3160:3160 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7953091Z 2022-11-23T01:43:17.7953666Z STAGE:2022-11-23 01:21:54 3160:3160 ActivityProfilerController.cpp:310] Completed Stage: Post Processing STAGE:2022-11-23 01:21:54 3159:3159 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7953704Z 2022-11-23T01:43:17.7954020Z STAGE:2022-11-23 01:21:54 3160:3160 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7954359Z STAGE:2022-11-23 01:21:54 3159:3159 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.7954701Z STAGE:2022-11-23 01:21:54 3160:3160 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7955310Z STAGE:2022-11-23 01:21:54 3159:3159 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.7955686Z STAGE:2022-11-23 01:21:54 3160:3160 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7956036Z STAGE:2022-11-23 01:21:54 3159:3159 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.7956141Z ok (5.979s) 2022-11-23T01:43:17.7956162Z 2022-11-23T01:43:17.7956430Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7956528Z Ran 1 test in 5.979s 2022-11-23T01:43:17.7956565Z 2022-11-23T01:43:17.7956641Z OK 2022-11-23T01:43:17.7956661Z 2022-11-23T01:43:17.7956787Z Generating XML reports...
2022-11-23T01:43:17.7957260Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012149.xml 2022-11-23T01:43:17.7957652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7957835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7958233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7958434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7958454Z 2022-11-23T01:43:17.7958563Z Running tests... 2022-11-23T01:43:17.7958813Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7959136Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7959404Z test_all_gather_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7959728Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3273 2022-11-23T01:43:17.7959952Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3274 2022-11-23T01:43:17.7960341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7960529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7960929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7961129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7961495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7961674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7962079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7962280Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7962536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7962863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7963292Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7963706Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7963930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7964168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7964335Z skip: Skipped due to small world size. (4.257s) 2022-11-23T01:43:17.7964357Z 2022-11-23T01:43:17.7964630Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7964745Z Ran 1 test in 4.258s 2022-11-23T01:43:17.7964765Z 2022-11-23T01:43:17.7964872Z OK (skipped=1) 2022-11-23T01:43:17.7964891Z 2022-11-23T01:43:17.7965018Z Generating XML reports... 2022-11-23T01:43:17.7965481Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012157.xml 2022-11-23T01:43:17.7965869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7966036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7966431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7966628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7966652Z 2022-11-23T01:43:17.7966762Z Running tests... 2022-11-23T01:43:17.7967028Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7967348Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7967664Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T01:43:17.7967684Z 2022-11-23T01:43:17.7967946Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7968062Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7968083Z 2022-11-23T01:43:17.7968173Z OK (skipped=1) 2022-11-23T01:43:17.7968192Z 2022-11-23T01:43:17.7968317Z Generating XML reports... 2022-11-23T01:43:17.7968779Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012204.xml 2022-11-23T01:43:17.7969230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7969412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7969808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7970010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7970031Z 2022-11-23T01:43:17.7970140Z Running tests... 2022-11-23T01:43:17.7970388Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7970710Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7971022Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T01:43:17.7971044Z 2022-11-23T01:43:17.7971304Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7971422Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7971443Z 2022-11-23T01:43:17.7971550Z OK (skipped=1) 2022-11-23T01:43:17.7971570Z 2022-11-23T01:43:17.7971695Z Generating XML reports... 
2022-11-23T01:43:17.7972154Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012207.xml 2022-11-23T01:43:17.7972591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7972767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7973165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7973363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7973384Z 2022-11-23T01:43:17.7973495Z Running tests... 2022-11-23T01:43:17.7973761Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7974086Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7974384Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T01:43:17.7974404Z 2022-11-23T01:43:17.7974667Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7974781Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7974801Z 2022-11-23T01:43:17.7974892Z OK (skipped=1) 2022-11-23T01:43:17.7974912Z 2022-11-23T01:43:17.7975039Z Generating XML reports... 
2022-11-23T01:43:17.7975496Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012209.xml 2022-11-23T01:43:17.7975881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7976063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7976461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7976659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7976680Z 2022-11-23T01:43:17.7976790Z Running tests... 2022-11-23T01:43:17.7977059Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7977365Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7977678Z test_all_gather_multigpu_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T01:43:17.7977698Z 2022-11-23T01:43:17.7977961Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7978075Z Ran 1 test in 0.002s 2022-11-23T01:43:17.7978095Z 2022-11-23T01:43:17.7978204Z OK (skipped=1) 2022-11-23T01:43:17.7978224Z 2022-11-23T01:43:17.7978410Z Generating XML reports... 
2022-11-23T01:43:17.7978872Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012211.xml 2022-11-23T01:43:17.7979256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7979443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7979821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7980020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7980040Z 2022-11-23T01:43:17.7980149Z Running tests... 2022-11-23T01:43:17.7980414Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7980734Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7981021Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7981256Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3508 2022-11-23T01:43:17.7981479Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3509 2022-11-23T01:43:17.7981901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7982093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7982489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7982687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7983066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7983248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7983639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7983836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7984093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7984337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7984757Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7985171Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7985413Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7985650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7985943Z [1669166538.199090] [d8f8c46cdf70:3508 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7986186Z [1669166538.994952] [d8f8c46cdf70:3508 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7986440Z [1669166538.994952] [d8f8c46cdf70:3508 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7986728Z [1669166538.221772] [d8f8c46cdf70:3509 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.7986953Z [1669166538.983632] [d8f8c46cdf70:3509 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.7987204Z [1669166538.983632] [d8f8c46cdf70:3509 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.7987370Z ok (5.927s) 2022-11-23T01:43:17.7987391Z 2022-11-23T01:43:17.7987663Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7987777Z Ran 1 test in 5.927s 2022-11-23T01:43:17.7987798Z 2022-11-23T01:43:17.7987891Z OK 2022-11-23T01:43:17.7987911Z 2022-11-23T01:43:17.7988039Z Generating XML reports... 
2022-11-23T01:43:17.7988506Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012214.xml 2022-11-23T01:43:17.7988894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7989060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7989453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7989652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7989676Z 2022-11-23T01:43:17.7989786Z Running tests... 2022-11-23T01:43:17.7990054Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.7990376Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.7990707Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.7990943Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3619 2022-11-23T01:43:17.7991166Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3620 2022-11-23T01:43:17.7991535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7991718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7992109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7992313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7992692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.7992871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.7993269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.7993465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.7993706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.7993960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.7994376Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.7994793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.7995245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.7995494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.7995747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.7996000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.7996418Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.7996810Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.7997061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:43:17.7997400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:43:17.7997809Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.7998223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.7998474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T01:43:17.7998722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T01:43:17.7999125Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T01:43:17.7999532Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T01:43:17.7999828Z [1669166546.766299] [d8f8c46cdf70:3620 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8000058Z [1669166547.547398] [d8f8c46cdf70:3620 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8000368Z [1669166547.547398] [d8f8c46cdf70:3620 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8000670Z [1669166546.743616] [d8f8c46cdf70:3619 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8000912Z [1669166547.538070] [d8f8c46cdf70:3619 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8001162Z [1669166547.538070] [d8f8c46cdf70:3619 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8001270Z ok (6.446s) 2022-11-23T01:43:17.8001291Z 2022-11-23T01:43:17.8001568Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8001682Z Ran 1 test in 6.446s 2022-11-23T01:43:17.8001702Z 2022-11-23T01:43:17.8001794Z OK 2022-11-23T01:43:17.8001814Z 2022-11-23T01:43:17.8001924Z Generating XML reports... 
2022-11-23T01:43:17.8002391Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012222.xml 2022-11-23T01:43:17.8002785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8002969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8003364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8003564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8003588Z 2022-11-23T01:43:17.8003700Z Running tests... 2022-11-23T01:43:17.8003966Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8004288Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8004550Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports all_gather_v (0.002s) 2022-11-23T01:43:17.8004570Z 2022-11-23T01:43:17.8004831Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8004948Z Ran 1 test in 0.003s 2022-11-23T01:43:17.8004968Z 2022-11-23T01:43:17.8005077Z OK (skipped=1) 2022-11-23T01:43:17.8005097Z 2022-11-23T01:43:17.8005221Z Generating XML reports... 
2022-11-23T01:43:17.8005683Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012231.xml 2022-11-23T01:43:17.8006072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8006315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8006713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8006900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8006920Z 2022-11-23T01:43:17.8007029Z Running tests... 2022-11-23T01:43:17.8007297Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8007622Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8008059Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8008079Z 2022-11-23T01:43:17.8008339Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8008457Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8008477Z 2022-11-23T01:43:17.8008585Z OK (skipped=1) 2022-11-23T01:43:17.8008605Z 2022-11-23T01:43:17.8008713Z Generating XML reports... 
2022-11-23T01:43:17.8009172Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012234.xml 2022-11-23T01:43:17.8009605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8009797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8010193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8010392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8010412Z 2022-11-23T01:43:17.8010520Z Running tests... 2022-11-23T01:43:17.8010784Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8011110Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8011529Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8011565Z 2022-11-23T01:43:17.8011812Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8011925Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8011944Z 2022-11-23T01:43:17.8012053Z OK (skipped=1) 2022-11-23T01:43:17.8012073Z 2022-11-23T01:43:17.8012197Z Generating XML reports... 
2022-11-23T01:43:17.8012652Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012236.xml 2022-11-23T01:43:17.8013033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8013214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8013608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8013792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8013811Z 2022-11-23T01:43:17.8013921Z Running tests... 2022-11-23T01:43:17.8014189Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8014512Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8014958Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8014979Z 2022-11-23T01:43:17.8015242Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8015355Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8015375Z 2022-11-23T01:43:17.8015483Z OK (skipped=1) 2022-11-23T01:43:17.8015555Z 2022-11-23T01:43:17.8015687Z Generating XML reports... 
2022-11-23T01:43:17.8016132Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012238.xml 2022-11-23T01:43:17.8016513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8016698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8017091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8017291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8017311Z 2022-11-23T01:43:17.8017420Z Running tests... 2022-11-23T01:43:17.8017685Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8018005Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8018427Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8018466Z 2022-11-23T01:43:17.8018713Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8018827Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8018847Z 2022-11-23T01:43:17.8018954Z OK (skipped=1) 2022-11-23T01:43:17.8018973Z 2022-11-23T01:43:17.8019148Z Generating XML reports... 
2022-11-23T01:43:17.8019616Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012241.xml 2022-11-23T01:43:17.8020000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8020182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8020574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8020762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8020799Z 2022-11-23T01:43:17.8020892Z Running tests... 2022-11-23T01:43:17.8021199Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8021519Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8021947Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8021967Z 2022-11-23T01:43:17.8022229Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8022342Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8022362Z 2022-11-23T01:43:17.8022470Z OK (skipped=1) 2022-11-23T01:43:17.8022489Z 2022-11-23T01:43:17.8022614Z Generating XML reports... 
2022-11-23T01:43:17.8023056Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012243.xml 2022-11-23T01:43:17.8023451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8023631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8024024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8024229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8024249Z 2022-11-23T01:43:17.8024359Z Running tests... 2022-11-23T01:43:17.8024622Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8024943Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8025365Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8025386Z 2022-11-23T01:43:17.8025693Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8025810Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8025830Z 2022-11-23T01:43:17.8025940Z OK (skipped=1) 2022-11-23T01:43:17.8025959Z 2022-11-23T01:43:17.8026084Z Generating XML reports... 
2022-11-23T01:43:17.8026548Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012246.xml 2022-11-23T01:43:17.8026933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8027115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8027508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8027707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8027727Z 2022-11-23T01:43:17.8027820Z Running tests... 2022-11-23T01:43:17.8028089Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8028412Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8028846Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8028868Z 2022-11-23T01:43:17.8029187Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8029308Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8029328Z 2022-11-23T01:43:17.8029436Z OK (skipped=1) 2022-11-23T01:43:17.8029455Z 2022-11-23T01:43:17.8029581Z Generating XML reports... 
2022-11-23T01:43:17.8030026Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012248.xml 2022-11-23T01:43:17.8030412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8030601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8030995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8031194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8031214Z 2022-11-23T01:43:17.8031324Z Running tests... 2022-11-23T01:43:17.8031592Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8031913Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8032334Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8032355Z 2022-11-23T01:43:17.8032619Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8032716Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8032735Z 2022-11-23T01:43:17.8032847Z OK (skipped=1) 2022-11-23T01:43:17.8032866Z 2022-11-23T01:43:17.8032991Z Generating XML reports... 
2022-11-23T01:43:17.8033446Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012250.xml 2022-11-23T01:43:17.8033830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8034015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8034410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8034609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8034629Z 2022-11-23T01:43:17.8034722Z Running tests... 2022-11-23T01:43:17.8034989Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8035596Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8036098Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8036120Z 2022-11-23T01:43:17.8036381Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8036496Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8036515Z 2022-11-23T01:43:17.8036626Z OK (skipped=1) 2022-11-23T01:43:17.8036647Z 2022-11-23T01:43:17.8036770Z Generating XML reports... 
2022-11-23T01:43:17.8037227Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012253.xml 2022-11-23T01:43:17.8037596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8037778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8038168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8038369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8038388Z 2022-11-23T01:43:17.8038497Z Running tests... 2022-11-23T01:43:17.8038762Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8039141Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8039466Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8039678Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4081 2022-11-23T01:43:17.8039900Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4082 2022-11-23T01:43:17.8040287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8040470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8040869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8041068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8041446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8041632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8042020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8042199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8042458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8042711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8043127Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8043537Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8043776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8044554Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T01:43:17.8044669Z warnings.warn( 2022-11-23T01:43:17.8044907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8045671Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T01:43:17.8045830Z warnings.warn( 2022-11-23T01:43:17.8045933Z ok (4.267s) 2022-11-23T01:43:17.8045953Z 2022-11-23T01:43:17.8046224Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8046344Z Ran 1 test in 4.267s 2022-11-23T01:43:17.8046364Z 2022-11-23T01:43:17.8046457Z OK 2022-11-23T01:43:17.8046476Z 2022-11-23T01:43:17.8046603Z Generating XML reports... 
2022-11-23T01:43:17.8047063Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012255.xml 2022-11-23T01:43:17.8047448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8047614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8048011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8048213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8048233Z 2022-11-23T01:43:17.8048343Z Running tests... 2022-11-23T01:43:17.8048612Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8048990Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8049416Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8049436Z 2022-11-23T01:43:17.8049701Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8049815Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8049835Z 2022-11-23T01:43:17.8049927Z OK (skipped=1) 2022-11-23T01:43:17.8049946Z 2022-11-23T01:43:17.8050071Z Generating XML reports... 
2022-11-23T01:43:17.8050536Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012302.xml 2022-11-23T01:43:17.8050924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8051105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8051497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8051696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8051716Z 2022-11-23T01:43:17.8051825Z Running tests... 2022-11-23T01:43:17.8052088Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8052392Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8052815Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8052839Z 2022-11-23T01:43:17.8053103Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8053218Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8053238Z 2022-11-23T01:43:17.8053345Z OK (skipped=1) 2022-11-23T01:43:17.8053364Z 2022-11-23T01:43:17.8053490Z Generating XML reports... 
2022-11-23T01:43:17.8053953Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012304.xml 2022-11-23T01:43:17.8054340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8054524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8054904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8055105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8055177Z 2022-11-23T01:43:17.8055294Z Running tests... 2022-11-23T01:43:17.8055560Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8055879Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8056295Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.8056315Z 2022-11-23T01:43:17.8056578Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8056692Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8056711Z 2022-11-23T01:43:17.8056803Z OK (skipped=1) 2022-11-23T01:43:17.8056839Z 2022-11-23T01:43:17.8056948Z Generating XML reports... 
2022-11-23T01:43:17.8057403Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012307.xml 2022-11-23T01:43:17.8057793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8057974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8058365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8058608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8058630Z 2022-11-23T01:43:17.8058747Z Running tests... 2022-11-23T01:43:17.8059014Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8059316Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8059616Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8059843Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4283 2022-11-23T01:43:17.8060075Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4284 2022-11-23T01:43:17.8060456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8060638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8061039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8061239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8061600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8061781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8062167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8062363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8062626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8062879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8063297Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8063707Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8063949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8064171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8064276Z ok (4.275s) 2022-11-23T01:43:17.8064295Z 2022-11-23T01:43:17.8064562Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8064738Z Ran 1 test in 4.276s 2022-11-23T01:43:17.8064759Z 2022-11-23T01:43:17.8064853Z OK 2022-11-23T01:43:17.8064872Z 2022-11-23T01:43:17.8064998Z Generating XML reports... 2022-11-23T01:43:17.8065458Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012309.xml 2022-11-23T01:43:17.8065844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8066028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8066404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8066603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8066623Z 2022-11-23T01:43:17.8066732Z Running tests... 2022-11-23T01:43:17.8066997Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8067325Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8067607Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8067834Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4386 2022-11-23T01:43:17.8068107Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4387 2022-11-23T01:43:17.8068485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8068668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8069061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8069260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8069637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8069823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8070214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8070414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8070672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8070911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8071329Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8071736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8071974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8072231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8072462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8072712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8073126Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8073530Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8073856Z STAGE:2022-11-23 01:23:20 4387:4387 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8074192Z STAGE:2022-11-23 01:23:20 4386:4386 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8074541Z [1669166600.477231] [d8f8c46cdf70:4387 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8074787Z [1669166601.515184] [d8f8c46cdf70:4387 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8075249Z [1669166601.515184] [d8f8c46cdf70:4387 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8075547Z [1669166600.456616] [d8f8c46cdf70:4386 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8075786Z [1669166601.485433] [d8f8c46cdf70:4386 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8076035Z [1669166601.485433] [d8f8c46cdf70:4386 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8076611Z STAGE:2022-11-23 01:23:21 4387:4387 ActivityProfilerController.cpp:306] Completed Stage: Collection STAGE:2022-11-23 01:23:21 4386:4386 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8076634Z 2022-11-23T01:43:17.8076993Z STAGE:2022-11-23 01:23:21 4387:4387 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8077420Z STAGE:2022-11-23 01:23:21 4386:4386 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8077752Z STAGE:2022-11-23 01:23:21 4387:4387 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8078080Z STAGE:2022-11-23 01:23:21 4386:4386 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8078417Z STAGE:2022-11-23 01:23:21 4387:4387 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8078750Z STAGE:2022-11-23 01:23:21 4386:4386 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8079103Z STAGE:2022-11-23 01:23:21 4387:4387 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8079445Z STAGE:2022-11-23 01:23:21 4386:4386 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8079549Z ok (5.805s) 2022-11-23T01:43:17.8079570Z 2022-11-23T01:43:17.8079840Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8079939Z Ran 1 test in 5.805s 2022-11-23T01:43:17.8079958Z 2022-11-23T01:43:17.8080051Z OK 2022-11-23T01:43:17.8080070Z 2022-11-23T01:43:17.8080191Z Generating XML reports... 
2022-11-23T01:43:17.8080649Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012316.xml 2022-11-23T01:43:17.8081037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8081216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8081611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8081809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8081829Z 2022-11-23T01:43:17.8081937Z Running tests... 2022-11-23T01:43:17.8082187Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8082505Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8082778Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8082999Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4500 2022-11-23T01:43:17.8083216Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4501 2022-11-23T01:43:17.8083594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8083910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8084300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8084480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8084852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8085029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8085419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8085608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8085860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8086108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8086523Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8086929Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8087202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8087440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8087684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8087929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8088339Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8088746Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8089077Z STAGE:2022-11-23 01:23:28 4500:4500 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8089409Z STAGE:2022-11-23 01:23:28 4501:4501 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8089697Z [1669166608.729738] [d8f8c46cdf70:4500 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8089926Z [1669166609.752139] [d8f8c46cdf70:4500 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8090173Z [1669166609.752139] [d8f8c46cdf70:4500 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8090454Z [1669166608.750106] [d8f8c46cdf70:4501 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8090691Z [1669166609.781782] [d8f8c46cdf70:4501 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8090937Z [1669166609.781782] [d8f8c46cdf70:4501 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8091499Z STAGE:2022-11-23 01:23:30 4500:4500 ActivityProfilerController.cpp:306] Completed Stage: Collection STAGE:2022-11-23 01:23:30 4501:4501 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8091520Z 2022-11-23T01:43:17.8092102Z STAGE:2022-11-23 01:23:30 4500:4500 ActivityProfilerController.cpp:310] Completed Stage: Post Processing STAGE:2022-11-23 01:23:30 4501:4501 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8092121Z 2022-11-23T01:43:17.8092449Z STAGE:2022-11-23 01:23:30 4500:4500 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8092840Z STAGE:2022-11-23 01:23:30 4501:4501 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8093178Z STAGE:2022-11-23 01:23:30 4500:4500 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8093511Z STAGE:2022-11-23 01:23:30 4501:4501 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8093846Z STAGE:2022-11-23 01:23:30 4500:4500 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8094190Z STAGE:2022-11-23 01:23:30 4501:4501 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8094288Z ok (5.721s) 2022-11-23T01:43:17.8094307Z 2022-11-23T01:43:17.8094571Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8094680Z Ran 1 test in 5.721s 2022-11-23T01:43:17.8094701Z 2022-11-23T01:43:17.8094788Z OK 2022-11-23T01:43:17.8094807Z 2022-11-23T01:43:17.8094932Z Generating XML reports... 
2022-11-23T01:43:17.8095391Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012324.xml 2022-11-23T01:43:17.8095774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8096024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8096428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8096621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8096641Z 2022-11-23T01:43:17.8096744Z Running tests... 2022-11-23T01:43:17.8097007Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8097323Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8097605Z test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8097833Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4614 2022-11-23T01:43:17.8098041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4615 2022-11-23T01:43:17.8098425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8098601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8098993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8099186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8099555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8099730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8100123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8100318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8100558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8100811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8101225Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8101630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8101863Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8102113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8102402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8102640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8103055Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8103448Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8103783Z STAGE:2022-11-23 01:23:37 4615:4615 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8104114Z STAGE:2022-11-23 01:23:37 4614:4614 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8104400Z [1669166617.173804] [d8f8c46cdf70:4615 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8104648Z [1669166618.194905] [d8f8c46cdf70:4615 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8104895Z [1669166618.194905] [d8f8c46cdf70:4615 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8105229Z [1669166617.152238] [d8f8c46cdf70:4614 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8105476Z [1669166618.194961] [d8f8c46cdf70:4614 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8105720Z [1669166618.194961] [d8f8c46cdf70:4614 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8106286Z STAGE:2022-11-23 01:23:38 4615:4615 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:23:38 4614:4614 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8106311Z 2022-11-23T01:43:17.8106892Z STAGE:2022-11-23 01:23:38 4615:4615 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:23:38 4614:4614 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8106912Z 2022-11-23T01:43:17.8107245Z STAGE:2022-11-23 01:23:38 4614:4614 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8107562Z STAGE:2022-11-23 01:23:38 4615:4615 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8107896Z STAGE:2022-11-23 01:23:38 4614:4614 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8108225Z STAGE:2022-11-23 01:23:38 4615:4615 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8108576Z STAGE:2022-11-23 01:23:38 4614:4614 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8108924Z STAGE:2022-11-23 01:23:38 4615:4615 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8109027Z ok (5.871s) 2022-11-23T01:43:17.8109047Z 2022-11-23T01:43:17.8109312Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8109427Z Ran 1 test in 5.871s 2022-11-23T01:43:17.8109446Z 2022-11-23T01:43:17.8109523Z OK 2022-11-23T01:43:17.8109546Z 2022-11-23T01:43:17.8109662Z Generating XML reports... 
2022-11-23T01:43:17.8110121Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012333.xml 2022-11-23T01:43:17.8110503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8110684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8111074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8111335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8111355Z 2022-11-23T01:43:17.8111463Z Running tests... 2022-11-23T01:43:17.8111727Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8112031Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8112313Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8112532Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4728 2022-11-23T01:43:17.8112754Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4729 2022-11-23T01:43:17.8113139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8113317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8113709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8113903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8114263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8114486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8114889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8115294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8115560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8115812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8116231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8116649Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8116890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8117130Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8117363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8117605Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8118016Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8118350Z STAGE:2022-11-23 01:23:45 4728:4728 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8118750Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8119083Z STAGE:2022-11-23 01:23:45 4729:4729 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8119377Z [1669166625.560767] [d8f8c46cdf70:4729 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8119619Z [1669166626.597801] [d8f8c46cdf70:4729 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8119855Z [1669166626.597801] [d8f8c46cdf70:4729 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8120137Z [1669166625.538873] [d8f8c46cdf70:4728 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8120372Z [1669166626.595590] [d8f8c46cdf70:4728 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8120702Z [1669166626.595590] [d8f8c46cdf70:4728 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8121301Z STAGE:2022-11-23 01:23:46 4729:4729 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:23:46 4728:4728 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8121324Z 2022-11-23T01:43:17.8121901Z STAGE:2022-11-23 01:23:46 4729:4729 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:23:46 4728:4728 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8121922Z 2022-11-23T01:43:17.8122249Z STAGE:2022-11-23 01:23:47 4729:4729 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8122578Z STAGE:2022-11-23 01:23:47 4728:4728 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8122920Z STAGE:2022-11-23 01:23:47 4729:4729 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8123481Z STAGE:2022-11-23 01:23:47 4729:4729 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:23:47 4728:4728 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8123502Z 2022-11-23T01:43:17.8123912Z STAGE:2022-11-23 01:23:47 4728:4728 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8124023Z ok (5.901s) 2022-11-23T01:43:17.8124042Z 2022-11-23T01:43:17.8124299Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8124413Z Ran 1 test in 5.901s 2022-11-23T01:43:17.8124433Z 2022-11-23T01:43:17.8124526Z OK 2022-11-23T01:43:17.8124545Z 2022-11-23T01:43:17.8124667Z Generating XML reports... 
2022-11-23T01:43:17.8125125Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012341.xml 2022-11-23T01:43:17.8125512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8125688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8126082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8126276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8126297Z 2022-11-23T01:43:17.8126390Z Running tests... 2022-11-23T01:43:17.8126647Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8126967Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8127234Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8127455Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4842 2022-11-23T01:43:17.8127679Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4843 2022-11-23T01:43:17.8128063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8128239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8128625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8128818Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8129193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8129368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8129753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8130006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8130257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8130506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8130923Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8131318Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8131554Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8131790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8131951Z skip: Skipped due to small world size. (4.247s) 2022-11-23T01:43:17.8131975Z 2022-11-23T01:43:17.8132236Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8132352Z Ran 1 test in 4.247s 2022-11-23T01:43:17.8132372Z 2022-11-23T01:43:17.8132473Z OK (skipped=1) 2022-11-23T01:43:17.8132493Z 2022-11-23T01:43:17.8132615Z Generating XML reports... 2022-11-23T01:43:17.8133109Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012350.xml 2022-11-23T01:43:17.8133506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8133685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8134072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8134269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8134289Z 2022-11-23T01:43:17.8134400Z Running tests... 2022-11-23T01:43:17.8134661Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8134976Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8135244Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8135457Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4945 2022-11-23T01:43:17.8135676Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4946 2022-11-23T01:43:17.8136058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8136235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8136625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8136817Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8137189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8137366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8137739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8137929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8138179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8138429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8138838Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8139245Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8139547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8139782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8139948Z skip: Skipped due to small world size. (4.244s) 2022-11-23T01:43:17.8139968Z 2022-11-23T01:43:17.8140225Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8140334Z Ran 1 test in 4.244s 2022-11-23T01:43:17.8140354Z 2022-11-23T01:43:17.8140456Z OK (skipped=1) 2022-11-23T01:43:17.8140476Z 2022-11-23T01:43:17.8140601Z Generating XML reports... 2022-11-23T01:43:17.8141054Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012356.xml 2022-11-23T01:43:17.8141436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8141619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8142013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8142211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8142231Z 2022-11-23T01:43:17.8142325Z Running tests... 2022-11-23T01:43:17.8142637Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8142968Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8143244Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8143467Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5048 2022-11-23T01:43:17.8143687Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5049 2022-11-23T01:43:17.8144072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8144253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8144628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8144823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8145195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8145369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8145754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8145948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8146201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8146453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8146865Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8147264Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8147504Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8147736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8147891Z skip: Skipped due to small world size. (4.210s) 2022-11-23T01:43:17.8147911Z 2022-11-23T01:43:17.8148174Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8148283Z Ran 1 test in 4.210s 2022-11-23T01:43:17.8148357Z 2022-11-23T01:43:17.8148467Z OK (skipped=1) 2022-11-23T01:43:17.8148488Z 2022-11-23T01:43:17.8148609Z Generating XML reports... 2022-11-23T01:43:17.8149071Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012403.xml 2022-11-23T01:43:17.8149444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8149625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8150016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8150213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8150233Z 2022-11-23T01:43:17.8150338Z Running tests... 2022-11-23T01:43:17.8150598Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8150916Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8151188Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8151399Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5151 2022-11-23T01:43:17.8151618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5152 2022-11-23T01:43:17.8152051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8152242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8152633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8152827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8153200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8153378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8153764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8153943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8154201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8154448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8154861Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8155473Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8155716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8155952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8156110Z skip: Skipped due to small world size. (4.253s) 2022-11-23T01:43:17.8156132Z 2022-11-23T01:43:17.8156400Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8156498Z Ran 1 test in 4.253s 2022-11-23T01:43:17.8156517Z 2022-11-23T01:43:17.8156623Z OK (skipped=1) 2022-11-23T01:43:17.8156643Z 2022-11-23T01:43:17.8156767Z Generating XML reports... 2022-11-23T01:43:17.8157226Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012410.xml 2022-11-23T01:43:17.8157609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8157794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8158184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8158474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8158494Z 2022-11-23T01:43:17.8158588Z Running tests... 2022-11-23T01:43:17.8158857Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8159186Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8159444Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8159667Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5254 2022-11-23T01:43:17.8159885Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5255 2022-11-23T01:43:17.8160270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8160447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8160840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8161023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8161402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8161654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8162051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8162244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8162496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8162742Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8163157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8163549Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8163784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8164126Z STAGE:2022-11-23 01:24:21 5254:5254 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8164357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8164683Z STAGE:2022-11-23 01:24:21 5255:5255 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8164973Z [1669166661.222803] [d8f8c46cdf70:5255 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8165214Z [1669166662.251249] [d8f8c46cdf70:5255 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8165473Z [1669166662.251249] [d8f8c46cdf70:5255 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8165763Z [1669166661.202337] [d8f8c46cdf70:5254 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8166001Z [1669166662.218986] [d8f8c46cdf70:5254 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8166236Z [1669166662.218986] [d8f8c46cdf70:5254 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8166802Z STAGE:2022-11-23 01:24:22 5255:5255 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:24:22 5254:5254 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8166874Z 2022-11-23T01:43:17.8167468Z STAGE:2022-11-23 01:24:22 5255:5255 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:24:22 5254:5254 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8167488Z 2022-11-23T01:43:17.8167819Z STAGE:2022-11-23 01:24:22 5254:5254 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8168159Z STAGE:2022-11-23 01:24:22 5255:5255 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8168499Z STAGE:2022-11-23 01:24:22 5254:5254 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8168833Z STAGE:2022-11-23 01:24:22 5255:5255 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8169184Z STAGE:2022-11-23 01:24:22 5254:5254 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8169533Z STAGE:2022-11-23 01:24:22 5255:5255 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8169636Z ok (5.873s) 2022-11-23T01:43:17.8169656Z 2022-11-23T01:43:17.8169926Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8170024Z Ran 1 test in 5.873s 2022-11-23T01:43:17.8170044Z 2022-11-23T01:43:17.8170134Z OK 2022-11-23T01:43:17.8170154Z 2022-11-23T01:43:17.8170281Z Generating XML reports... 
2022-11-23T01:43:17.8170792Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012417.xml 2022-11-23T01:43:17.8171184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8171365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8171756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8171955Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8171979Z 2022-11-23T01:43:17.8172074Z Running tests... 2022-11-23T01:43:17.8172340Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8172659Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8172920Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8173143Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5368 2022-11-23T01:43:17.8173364Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5369 2022-11-23T01:43:17.8173746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8173926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8174321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8174510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8174888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8175066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8175451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8175644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8175897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8176147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8176564Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8177023Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8177261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8177494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8177838Z STAGE:2022-11-23 01:24:29 5368:5368 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8178168Z STAGE:2022-11-23 01:24:29 5369:5369 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8178454Z [1669166669.467113] [d8f8c46cdf70:5369 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8178695Z [1669166670.491548] [d8f8c46cdf70:5369 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8178952Z [1669166670.491548] [d8f8c46cdf70:5369 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8179237Z [1669166669.446550] [d8f8c46cdf70:5368 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8179522Z [1669166670.478635] [d8f8c46cdf70:5368 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8179769Z [1669166670.478635] [d8f8c46cdf70:5368 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8180335Z STAGE:2022-11-23 01:24:30 5369:5369 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:24:30 5368:5368 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8180355Z 2022-11-23T01:43:17.8180935Z STAGE:2022-11-23 01:24:30 5368:5368 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:24:30 5369:5369 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8180960Z 2022-11-23T01:43:17.8181298Z STAGE:2022-11-23 01:24:30 5369:5369 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8181628Z STAGE:2022-11-23 01:24:30 5368:5368 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8181973Z STAGE:2022-11-23 01:24:30 5369:5369 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8182536Z STAGE:2022-11-23 01:24:30 5368:5368 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:24:30 5369:5369 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8182556Z 2022-11-23T01:43:17.8182908Z STAGE:2022-11-23 01:24:30 5368:5368 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8183017Z ok (5.770s) 2022-11-23T01:43:17.8183037Z 2022-11-23T01:43:17.8183310Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8183424Z Ran 1 test in 5.771s 2022-11-23T01:43:17.8183448Z 2022-11-23T01:43:17.8183524Z OK 2022-11-23T01:43:17.8183543Z 2022-11-23T01:43:17.8183666Z Generating XML reports... 
2022-11-23T01:43:17.8184125Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012425.xml 2022-11-23T01:43:17.8184512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8184697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8185090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8185285Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8185304Z 2022-11-23T01:43:17.8185416Z Running tests... 2022-11-23T01:43:17.8185737Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8186045Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8186317Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8186539Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5482 2022-11-23T01:43:17.8186766Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5483 2022-11-23T01:43:17.8187155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8187337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8187730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8187929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8188295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8188473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8188857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8189101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8189366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8189618Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8190034Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8190441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8190683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8190907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8191248Z STAGE:2022-11-23 01:24:38 5483:5483 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8192053Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:43:17.8192170Z warnings.warn( 2022-11-23T01:43:17.8192508Z STAGE:2022-11-23 01:24:38 5482:5482 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8193295Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:43:17.8193413Z warnings.warn( 2022-11-23T01:43:17.8193705Z [1669166678.897785] [d8f8c46cdf70:5483 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8193952Z [1669166678.906853] [d8f8c46cdf70:5483 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8194202Z [1669166678.906853] [d8f8c46cdf70:5483 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8194538Z STAGE:2022-11-23 01:24:39 5483:5483 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8194825Z [1669166678.893251] [d8f8c46cdf70:5482 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. 
Fall back to non cooperative launch. 2022-11-23T01:43:17.8195334Z [1669166678.902870] [d8f8c46cdf70:5482 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8195590Z [1669166678.902870] [d8f8c46cdf70:5482 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8195948Z STAGE:2022-11-23 01:24:39 5482:5482 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8196300Z STAGE:2022-11-23 01:24:39 5483:5483 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8196644Z STAGE:2022-11-23 01:24:39 5482:5482 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8196976Z STAGE:2022-11-23 01:24:39 5482:5482 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8197310Z STAGE:2022-11-23 01:24:39 5482:5482 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8197659Z STAGE:2022-11-23 01:24:39 5482:5482 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8197971Z STAGE:2022-11-23 01:24:39 5483:5483 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8198307Z STAGE:2022-11-23 01:24:39 5483:5483 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8198729Z STAGE:2022-11-23 01:24:39 5483:5483 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8198843Z ok (6.050s) 2022-11-23T01:43:17.8198865Z 2022-11-23T01:43:17.8199132Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8199245Z Ran 1 test in 6.051s 2022-11-23T01:43:17.8199265Z 2022-11-23T01:43:17.8199358Z OK 2022-11-23T01:43:17.8199377Z 2022-11-23T01:43:17.8199502Z Generating XML reports... 
2022-11-23T01:43:17.8199950Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012433.xml 2022-11-23T01:43:17.8200342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8200523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8200917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8201117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8201137Z 2022-11-23T01:43:17.8201241Z Running tests... 2022-11-23T01:43:17.8201505Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8201822Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8202106Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8202316Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5600 2022-11-23T01:43:17.8202546Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5601 2022-11-23T01:43:17.8202933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8203113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8203509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8203705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8204085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8204266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8204637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8204905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8205161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8205414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8205838Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8206246Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8206483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8206718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8207061Z STAGE:2022-11-23 01:24:47 5600:5600 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8207849Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:43:17.8207965Z warnings.warn( 2022-11-23T01:43:17.8208350Z STAGE:2022-11-23 01:24:47 5601:5601 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8209154Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:43:17.8209266Z warnings.warn( 2022-11-23T01:43:17.8209551Z [1669166687.501741] [d8f8c46cdf70:5600 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8209799Z [1669166687.511113] [d8f8c46cdf70:5600 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8210046Z [1669166687.511113] [d8f8c46cdf70:5600 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8210396Z STAGE:2022-11-23 01:24:47 5600:5600 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8210685Z [1669166687.510426] [d8f8c46cdf70:5601 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. 
Fall back to non cooperative launch. 2022-11-23T01:43:17.8210923Z [1669166687.519687] [d8f8c46cdf70:5601 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8211156Z [1669166687.519687] [d8f8c46cdf70:5601 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8211499Z STAGE:2022-11-23 01:24:47 5601:5601 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8211857Z STAGE:2022-11-23 01:24:47 5600:5600 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8212203Z STAGE:2022-11-23 01:24:47 5601:5601 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8212536Z STAGE:2022-11-23 01:24:47 5600:5600 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8212868Z STAGE:2022-11-23 01:24:47 5601:5601 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8213207Z STAGE:2022-11-23 01:24:47 5600:5600 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8213556Z STAGE:2022-11-23 01:24:47 5600:5600 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8213893Z STAGE:2022-11-23 01:24:47 5601:5601 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8214292Z STAGE:2022-11-23 01:24:47 5601:5601 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8214395Z ok (5.941s) 2022-11-23T01:43:17.8214415Z 2022-11-23T01:43:17.8214682Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8214797Z Ran 1 test in 5.941s 2022-11-23T01:43:17.8214817Z 2022-11-23T01:43:17.8214911Z OK 2022-11-23T01:43:17.8214930Z 2022-11-23T01:43:17.8215060Z Generating XML reports... 
2022-11-23T01:43:17.8215525Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012442.xml 2022-11-23T01:43:17.8215913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8216078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8216467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8216670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8216690Z 2022-11-23T01:43:17.8216799Z Running tests... 2022-11-23T01:43:17.8217066Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8217388Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8217712Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8217947Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5718 2022-11-23T01:43:17.8218154Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5719 2022-11-23T01:43:17.8218541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8218724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8219117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8219320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8219703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8219885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8220276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8220475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8220718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8220971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8221432Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8221846Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8222086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8222324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8222666Z STAGE:2022-11-23 01:24:54 5719:5719 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8223004Z STAGE:2022-11-23 01:24:54 5718:5718 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8223296Z [1669166694.912188] [d8f8c46cdf70:5718 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8223526Z [1669166695.966371] [d8f8c46cdf70:5718 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8223840Z [1669166695.966371] [d8f8c46cdf70:5718 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8224129Z [1669166694.932641] [d8f8c46cdf70:5719 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8224373Z [1669166695.969566] [d8f8c46cdf70:5719 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8224622Z [1669166695.969566] [d8f8c46cdf70:5719 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8225187Z STAGE:2022-11-23 01:24:56 5718:5718 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:24:56 5719:5719 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8225208Z 2022-11-23T01:43:17.8225565Z STAGE:2022-11-23 01:24:56 5719:5719 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8225922Z STAGE:2022-11-23 01:24:56 5718:5718 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8226245Z STAGE:2022-11-23 01:24:56 5719:5719 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8226702Z STAGE:2022-11-23 01:24:56 5718:5718 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8227058Z STAGE:2022-11-23 01:24:56 5719:5719 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8227629Z STAGE:2022-11-23 01:24:56 5718:5718 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:24:56 5719:5719 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8227650Z 2022-11-23T01:43:17.8228001Z STAGE:2022-11-23 01:24:56 5718:5718 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8228114Z ok (5.766s) 2022-11-23T01:43:17.8228134Z 2022-11-23T01:43:17.8228402Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8228519Z Ran 1 test in 5.766s 2022-11-23T01:43:17.8228539Z 2022-11-23T01:43:17.8228616Z OK 2022-11-23T01:43:17.8228655Z 2022-11-23T01:43:17.8228764Z Generating XML reports... 
2022-11-23T01:43:17.8229234Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012451.xml 2022-11-23T01:43:17.8229621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8229807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8230203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8230403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8230427Z 2022-11-23T01:43:17.8230539Z Running tests... 2022-11-23T01:43:17.8230809Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8231120Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8231396Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8231626Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5832 2022-11-23T01:43:17.8231850Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5833 2022-11-23T01:43:17.8232240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8232422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8232816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8233079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8233449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8233625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8234022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8234219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8234478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8234732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8235452Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8235875Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8236124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8236345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8236716Z [1669166704.134301] [d8f8c46cdf70:5833 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8236972Z [1669166704.141749] [d8f8c46cdf70:5833 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8237225Z [1669166704.141749] [d8f8c46cdf70:5833 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8237513Z [1669166704.127710] [d8f8c46cdf70:5832 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8237762Z [1669166704.133269] [d8f8c46cdf70:5832 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8238017Z [1669166704.133269] [d8f8c46cdf70:5832 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8238123Z ok (5.551s) 2022-11-23T01:43:17.8238145Z 2022-11-23T01:43:17.8238423Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8238540Z Ran 1 test in 5.551s 2022-11-23T01:43:17.8238560Z 2022-11-23T01:43:17.8238636Z OK 2022-11-23T01:43:17.8238655Z 2022-11-23T01:43:17.8238783Z Generating XML reports... 
2022-11-23T01:43:17.8239246Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012459.xml 2022-11-23T01:43:17.8239635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8239824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8240258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8240465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8240485Z 2022-11-23T01:43:17.8240596Z Running tests... 2022-11-23T01:43:17.8240850Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8241178Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8241442Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8241667Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5946 2022-11-23T01:43:17.8241892Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5947 2022-11-23T01:43:17.8242276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8242533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8242933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8243136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8243501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8243684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8244075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8244275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8244531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8244791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8245212Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8245672Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8245922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8246251Z STAGE:2022-11-23 01:25:11 5946:5946 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8246491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8246827Z STAGE:2022-11-23 01:25:11 5947:5947 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8247121Z [1669166711.442682] [d8f8c46cdf70:5946 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8247372Z [1669166712.488486] [d8f8c46cdf70:5946 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8247630Z [1669166712.488486] [d8f8c46cdf70:5946 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8247921Z [1669166711.444290] [d8f8c46cdf70:5947 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8248168Z [1669166712.505914] [d8f8c46cdf70:5947 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8248418Z [1669166712.505914] [d8f8c46cdf70:5947 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8248983Z STAGE:2022-11-23 01:25:12 5946:5946 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:25:12 5947:5947 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8249009Z 2022-11-23T01:43:17.8249576Z STAGE:2022-11-23 01:25:12 5947:5947 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:25:12 5946:5946 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8249618Z 2022-11-23T01:43:17.8249938Z STAGE:2022-11-23 01:25:12 5946:5946 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8250274Z STAGE:2022-11-23 01:25:12 5947:5947 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8250616Z STAGE:2022-11-23 01:25:12 5946:5946 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8250950Z STAGE:2022-11-23 01:25:12 5947:5947 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8251305Z STAGE:2022-11-23 01:25:12 5946:5946 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8251715Z STAGE:2022-11-23 01:25:12 5947:5947 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8251821Z ok (5.923s) 2022-11-23T01:43:17.8251841Z 2022-11-23T01:43:17.8252110Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8252223Z Ran 1 test in 5.923s 2022-11-23T01:43:17.8252247Z 2022-11-23T01:43:17.8252325Z OK 2022-11-23T01:43:17.8252344Z 2022-11-23T01:43:17.8252471Z Generating XML reports... 
2022-11-23T01:43:17.8252935Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012507.xml 2022-11-23T01:43:17.8253320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8253503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8253897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8254102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8254123Z 2022-11-23T01:43:17.8254233Z Running tests... 2022-11-23T01:43:17.8254484Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8254861Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8255141Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8255368Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6060 2022-11-23T01:43:17.8255591Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6061 2022-11-23T01:43:17.8255981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8256171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8256566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8256769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8257136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8257317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8257709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8257907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8258163Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8258419Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8258840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8259254Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8259502Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8259834Z STAGE:2022-11-23 01:25:19 6060:6060 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8260069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8260400Z STAGE:2022-11-23 01:25:19 6061:6061 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8260697Z [1669166719.876961] [d8f8c46cdf70:6060 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8261005Z [1669166720.909721] [d8f8c46cdf70:6060 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8261263Z [1669166720.909721] [d8f8c46cdf70:6060 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8261555Z [1669166719.898109] [d8f8c46cdf70:6061 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8261797Z [1669166720.942228] [d8f8c46cdf70:6061 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8262049Z [1669166720.942228] [d8f8c46cdf70:6061 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8262613Z STAGE:2022-11-23 01:25:21 6060:6060 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:25:21 6061:6061 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8262637Z 2022-11-23T01:43:17.8262981Z STAGE:2022-11-23 01:25:21 6061:6061 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8263336Z STAGE:2022-11-23 01:25:21 6060:6060 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8263926Z STAGE:2022-11-23 01:25:21 6060:6060 ActivityProfilerController.cpp:300] Completed Stage: Warm UpSTAGE:2022-11-23 01:25:21 6061:6061 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8263949Z 2022-11-23T01:43:17.8264297Z STAGE:2022-11-23 01:25:21 6060:6060 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8264629Z STAGE:2022-11-23 01:25:21 6061:6061 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8264981Z STAGE:2022-11-23 01:25:21 6060:6060 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8265328Z STAGE:2022-11-23 01:25:21 6061:6061 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8265438Z ok (5.915s) 2022-11-23T01:43:17.8265457Z 2022-11-23T01:43:17.8265729Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8265826Z Ran 1 test in 5.915s 2022-11-23T01:43:17.8265864Z 2022-11-23T01:43:17.8265942Z OK 2022-11-23T01:43:17.8265961Z 2022-11-23T01:43:17.8266092Z Generating XML reports... 
2022-11-23T01:43:17.8266555Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012515.xml 2022-11-23T01:43:17.8266945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8267131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8267525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8267732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8267752Z 2022-11-23T01:43:17.8267864Z Running tests... 2022-11-23T01:43:17.8268116Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8268437Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8268718Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8268946Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6174 2022-11-23T01:43:17.8269171Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6175 2022-11-23T01:43:17.8269557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8269741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8270137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8270397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8270758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8270938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8271331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8271530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8271787Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8272042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8272456Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8272874Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8273099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8273493Z STAGE:2022-11-23 01:25:28 6174:6174 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8273733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8274067Z STAGE:2022-11-23 01:25:28 6175:6175 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8274357Z [1669166728.384262] [d8f8c46cdf70:6174 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8274604Z [1669166729.430684] [d8f8c46cdf70:6174 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8274865Z [1669166729.430684] [d8f8c46cdf70:6174 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8275364Z [1669166728.385878] [d8f8c46cdf70:6175 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8275619Z [1669166729.458401] [d8f8c46cdf70:6175 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8275871Z [1669166729.458401] [d8f8c46cdf70:6175 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8276429Z STAGE:2022-11-23 01:25:29 6174:6174 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:25:29 6175:6175 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8276468Z 2022-11-23T01:43:17.8277039Z STAGE:2022-11-23 01:25:29 6174:6174 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:25:29 6175:6175 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8277082Z 2022-11-23T01:43:17.8277399Z STAGE:2022-11-23 01:25:29 6174:6174 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8277737Z STAGE:2022-11-23 01:25:29 6175:6175 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8278079Z STAGE:2022-11-23 01:25:29 6174:6174 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8278429Z STAGE:2022-11-23 01:25:29 6174:6174 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8278767Z STAGE:2022-11-23 01:25:29 6175:6175 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8279114Z STAGE:2022-11-23 01:25:29 6175:6175 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8279220Z ok (5.956s) 2022-11-23T01:43:17.8279317Z 2022-11-23T01:43:17.8279591Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8279689Z Ran 1 test in 5.956s 2022-11-23T01:43:17.8279727Z 2022-11-23T01:43:17.8279803Z OK 2022-11-23T01:43:17.8279822Z 2022-11-23T01:43:17.8279949Z Generating XML reports... 
2022-11-23T01:43:17.8280418Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012524.xml 2022-11-23T01:43:17.8280810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8280993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8281389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8281594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8281614Z 2022-11-23T01:43:17.8281729Z Running tests... 2022-11-23T01:43:17.8281978Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8282304Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8282622Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T01:43:17.8282701Z 2022-11-23T01:43:17.8282976Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8283092Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8283112Z 2022-11-23T01:43:17.8283222Z OK (skipped=1) 2022-11-23T01:43:17.8283242Z 2022-11-23T01:43:17.8283369Z Generating XML reports... 
2022-11-23T01:43:17.8283829Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012532.xml 2022-11-23T01:43:17.8284217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8284390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8284789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8284990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8285010Z 2022-11-23T01:43:17.8285127Z Running tests... 2022-11-23T01:43:17.8285397Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8285723Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8286046Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T01:43:17.8286067Z 2022-11-23T01:43:17.8286332Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8286449Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8286473Z 2022-11-23T01:43:17.8286566Z OK (skipped=1) 2022-11-23T01:43:17.8286585Z 2022-11-23T01:43:17.8286712Z Generating XML reports... 
2022-11-23T01:43:17.8287174Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012535.xml 2022-11-23T01:43:17.8287563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8287750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8288146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8288349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8288369Z 2022-11-23T01:43:17.8288484Z Running tests... 2022-11-23T01:43:17.8288734Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8289056Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8289446Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T01:43:17.8289466Z 2022-11-23T01:43:17.8289732Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8289847Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8289870Z 2022-11-23T01:43:17.8289981Z OK (skipped=1) 2022-11-23T01:43:17.8290000Z 2022-11-23T01:43:17.8290129Z Generating XML reports... 
2022-11-23T01:43:17.8290585Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012537.xml 2022-11-23T01:43:17.8290975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8291142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8291536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8291742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8291761Z 2022-11-23T01:43:17.8291872Z Running tests... 2022-11-23T01:43:17.8292137Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8292519Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8292784Z test_all_to_all (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:43:17.8292804Z 2022-11-23T01:43:17.8293068Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8293182Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8293202Z 2022-11-23T01:43:17.8293293Z OK (skipped=1) 2022-11-23T01:43:17.8293312Z 2022-11-23T01:43:17.8293438Z Generating XML reports... 
2022-11-23T01:43:17.8293896Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012540.xml 2022-11-23T01:43:17.8294288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8294471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8294865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8295065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8295085Z 2022-11-23T01:43:17.8295196Z Running tests... 2022-11-23T01:43:17.8295462Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8295767Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8296035Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:43:17.8296059Z 2022-11-23T01:43:17.8296324Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8296438Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8296458Z 2022-11-23T01:43:17.8296565Z OK (skipped=1) 2022-11-23T01:43:17.8296584Z 2022-11-23T01:43:17.8296711Z Generating XML reports... 
2022-11-23T01:43:17.8297174Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012542.xml 2022-11-23T01:43:17.8297561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8297728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8298123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8298324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8298344Z 2022-11-23T01:43:17.8298512Z Running tests... 2022-11-23T01:43:17.8298780Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8299101Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8299374Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T01:43:17.8299394Z 2022-11-23T01:43:17.8299660Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8299776Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8299795Z 2022-11-23T01:43:17.8299887Z OK (skipped=1) 2022-11-23T01:43:17.8299906Z 2022-11-23T01:43:17.8300034Z Generating XML reports... 
2022-11-23T01:43:17.8300497Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012544.xml 2022-11-23T01:43:17.8300887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8301075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8301472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8301675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8301696Z 2022-11-23T01:43:17.8301810Z Running tests... 2022-11-23T01:43:17.8302127Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8302442Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8302727Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T01:43:17.8302747Z 2022-11-23T01:43:17.8303015Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8303133Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8303153Z 2022-11-23T01:43:17.8303268Z OK (skipped=1) 2022-11-23T01:43:17.8303288Z 2022-11-23T01:43:17.8303414Z Generating XML reports... 
2022-11-23T01:43:17.8303876Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012547.xml 2022-11-23T01:43:17.8304262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8304449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8304823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8305026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8305046Z 2022-11-23T01:43:17.8305158Z Running tests... 2022-11-23T01:43:17.8305424Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8305744Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8306022Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:43:17.8306042Z 2022-11-23T01:43:17.8306305Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8306422Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8306442Z 2022-11-23T01:43:17.8306533Z OK (skipped=1) 2022-11-23T01:43:17.8306574Z 2022-11-23T01:43:17.8306685Z Generating XML reports... 
2022-11-23T01:43:17.8307148Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012549.xml 2022-11-23T01:43:17.8307536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8307716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8308110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8308378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8308398Z 2022-11-23T01:43:17.8308508Z Running tests... 2022-11-23T01:43:17.8308846Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8309151Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8309482Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T01:43:17.8309503Z 2022-11-23T01:43:17.8309806Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8310081Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8310101Z 2022-11-23T01:43:17.8310251Z OK (skipped=1) 2022-11-23T01:43:17.8310271Z 2022-11-23T01:43:17.8310444Z Generating XML reports... 
2022-11-23T01:43:17.8310891Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012552.xml 2022-11-23T01:43:17.8311376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8311594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8312082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8312331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8312352Z 2022-11-23T01:43:17.8312498Z Running tests... 2022-11-23T01:43:17.8312862Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8313224Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8313525Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:43:17.8313546Z 2022-11-23T01:43:17.8313802Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8313955Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8313975Z 2022-11-23T01:43:17.8314120Z OK (skipped=1) 2022-11-23T01:43:17.8314140Z 2022-11-23T01:43:17.8314299Z Generating XML reports... 
2022-11-23T01:43:17.8314802Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012554.xml 2022-11-23T01:43:17.8315470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8315743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8316187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8316477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8316498Z 2022-11-23T01:43:17.8316594Z Running tests... 2022-11-23T01:43:17.8316907Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8317266Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8317592Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8317613Z 2022-11-23T01:43:17.8317932Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8318085Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8318105Z 2022-11-23T01:43:17.8318287Z OK (skipped=1) 2022-11-23T01:43:17.8318307Z 2022-11-23T01:43:17.8318470Z Generating XML reports... 
2022-11-23T01:43:17.8318913Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012556.xml 2022-11-23T01:43:17.8319338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8319652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8320090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8320335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8320356Z 2022-11-23T01:43:17.8320501Z Running tests... 2022-11-23T01:43:17.8320807Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8321252Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8321645Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8321666Z 2022-11-23T01:43:17.8321918Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8322071Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8322092Z 2022-11-23T01:43:17.8322238Z OK (skipped=1) 2022-11-23T01:43:17.8322263Z 2022-11-23T01:43:17.8322434Z Generating XML reports... 
2022-11-23T01:43:17.8322929Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012559.xml 2022-11-23T01:43:17.8323354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8323640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8324120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8324357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8324378Z 2022-11-23T01:43:17.8324472Z Running tests... 2022-11-23T01:43:17.8324786Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8325145Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8325498Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8325519Z 2022-11-23T01:43:17.8325821Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8325972Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8325992Z 2022-11-23T01:43:17.8326139Z OK (skipped=1) 2022-11-23T01:43:17.8326163Z 2022-11-23T01:43:17.8326411Z Generating XML reports... 
2022-11-23T01:43:17.8326924Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012601.xml 2022-11-23T01:43:17.8327294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8327516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8327948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8328191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8328212Z 2022-11-23T01:43:17.8328365Z Running tests... 2022-11-23T01:43:17.8328672Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8329031Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8329426Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8329448Z 2022-11-23T01:43:17.8329752Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8329851Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8329870Z 2022-11-23T01:43:17.8330015Z OK (skipped=1) 2022-11-23T01:43:17.8330035Z 2022-11-23T01:43:17.8330193Z Generating XML reports... 
2022-11-23T01:43:17.8330690Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012604.xml 2022-11-23T01:43:17.8331177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8331397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8331889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8332165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8332186Z 2022-11-23T01:43:17.8332282Z Running tests... 2022-11-23T01:43:17.8332587Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8332942Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8333298Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8333323Z 2022-11-23T01:43:17.8333622Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8333775Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8333795Z 2022-11-23T01:43:17.8333949Z OK (skipped=1) 2022-11-23T01:43:17.8333969Z 2022-11-23T01:43:17.8334150Z Generating XML reports... 
2022-11-23T01:43:17.8334728Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012606.xml 2022-11-23T01:43:17.8335111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8335332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8335757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8335995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8336016Z 2022-11-23T01:43:17.8336176Z Running tests... 2022-11-23T01:43:17.8336484Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8336900Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8337246Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8337271Z 2022-11-23T01:43:17.8337607Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8337708Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8337728Z 2022-11-23T01:43:17.8337877Z OK (skipped=1) 2022-11-23T01:43:17.8337896Z 2022-11-23T01:43:17.8338056Z Generating XML reports... 
2022-11-23T01:43:17.8338564Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012608.xml 2022-11-23T01:43:17.8338990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8339212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8339647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8339886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8339907Z 2022-11-23T01:43:17.8340088Z Running tests... 2022-11-23T01:43:17.8340340Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8340695Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8341058Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8341079Z 2022-11-23T01:43:17.8341377Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8341528Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8341599Z 2022-11-23T01:43:17.8341805Z OK (skipped=1) 2022-11-23T01:43:17.8341825Z 2022-11-23T01:43:17.8341989Z Generating XML reports... 
2022-11-23T01:43:17.8342488Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012611.xml 2022-11-23T01:43:17.8342950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8343123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8343567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8343802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8343822Z 2022-11-23T01:43:17.8343971Z Running tests... 2022-11-23T01:43:17.8344278Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8344640Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8344985Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8345006Z 2022-11-23T01:43:17.8345311Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8345410Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8345577Z 2022-11-23T01:43:17.8345681Z OK (skipped=1) 2022-11-23T01:43:17.8345700Z 2022-11-23T01:43:17.8345864Z Generating XML reports... 
2022-11-23T01:43:17.8346363Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012613.xml 2022-11-23T01:43:17.8346786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8347056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8347486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8347739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8347760Z 2022-11-23T01:43:17.8347905Z Running tests... 2022-11-23T01:43:17.8348156Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8348555Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8348907Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8348928Z 2022-11-23T01:43:17.8349226Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8349378Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8349399Z 2022-11-23T01:43:17.8349544Z OK (skipped=1) 2022-11-23T01:43:17.8349564Z 2022-11-23T01:43:17.8349732Z Generating XML reports... 
2022-11-23T01:43:17.8350237Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012616.xml 2022-11-23T01:43:17.8350664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8350830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8351299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8351536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8351557Z 2022-11-23T01:43:17.8351702Z Running tests... 2022-11-23T01:43:17.8352006Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8352429Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8352769Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8352841Z 2022-11-23T01:43:17.8353152Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8353305Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8353325Z 2022-11-23T01:43:17.8353418Z OK (skipped=1) 2022-11-23T01:43:17.8353438Z 2022-11-23T01:43:17.8353637Z Generating XML reports... 
2022-11-23T01:43:17.8354141Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012618.xml 2022-11-23T01:43:17.8354567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8354797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8355464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8355708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8355734Z 2022-11-23T01:43:17.8355859Z Running tests... 2022-11-23T01:43:17.8387405Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8387815Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8388261Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8388289Z 2022-11-23T01:43:17.8388566Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8388673Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8388694Z 2022-11-23T01:43:17.8388791Z OK (skipped=1) 2022-11-23T01:43:17.8388811Z 2022-11-23T01:43:17.8388925Z Generating XML reports... 
2022-11-23T01:43:17.8389371Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012620.xml 2022-11-23T01:43:17.8389750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8389918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8390298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8390491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8390511Z 2022-11-23T01:43:17.8390611Z Running tests... 2022-11-23T01:43:17.8390868Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8391174Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8391459Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8391487Z 2022-11-23T01:43:17.8391732Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8391839Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8391858Z 2022-11-23T01:43:17.8391955Z OK (skipped=1) 2022-11-23T01:43:17.8391974Z 2022-11-23T01:43:17.8392087Z Generating XML reports... 
2022-11-23T01:43:17.8392528Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012623.xml 2022-11-23T01:43:17.8392896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8393064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8393440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8393621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8393647Z 2022-11-23T01:43:17.8393740Z Running tests... 2022-11-23T01:43:17.8393994Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8394389Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8394688Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8394708Z 2022-11-23T01:43:17.8394964Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8395281Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8395305Z 2022-11-23T01:43:17.8395409Z OK (skipped=1) 2022-11-23T01:43:17.8395429Z 2022-11-23T01:43:17.8395543Z Generating XML reports... 
2022-11-23T01:43:17.8395990Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012625.xml 2022-11-23T01:43:17.8396355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8396524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8396908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8397091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8397112Z 2022-11-23T01:43:17.8397209Z Running tests... 2022-11-23T01:43:17.8397536Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8397855Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8398149Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8398169Z 2022-11-23T01:43:17.8398416Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8398518Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8398537Z 2022-11-23T01:43:17.8398633Z OK (skipped=1) 2022-11-23T01:43:17.8398657Z 2022-11-23T01:43:17.8398770Z Generating XML reports... 
2022-11-23T01:43:17.8399210Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012628.xml 2022-11-23T01:43:17.8399578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8399751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8400128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8400313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8400333Z 2022-11-23T01:43:17.8400425Z Running tests... 2022-11-23T01:43:17.8400678Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8400983Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8401289Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8401309Z 2022-11-23T01:43:17.8401564Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8401668Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8401687Z 2022-11-23T01:43:17.8401784Z OK (skipped=1) 2022-11-23T01:43:17.8401807Z 2022-11-23T01:43:17.8401920Z Generating XML reports... 
2022-11-23T01:43:17.8402361Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012630.xml 2022-11-23T01:43:17.8402722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8402891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8403266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8403522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8403542Z 2022-11-23T01:43:17.8403640Z Running tests... 2022-11-23T01:43:17.8403895Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8404200Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8404490Z test_all_to_all_single_unequal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:43:17.8404510Z 2022-11-23T01:43:17.8404760Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8404857Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8404877Z 2022-11-23T01:43:17.8404974Z OK (skipped=1) 2022-11-23T01:43:17.8404993Z 2022-11-23T01:43:17.8405105Z Generating XML reports... 
2022-11-23T01:43:17.8405543Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012632.xml 2022-11-23T01:43:17.8405914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8406081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8406501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8406694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8406715Z 2022-11-23T01:43:17.8406806Z Running tests... 2022-11-23T01:43:17.8407060Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8407362Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8407658Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:43:17.8407683Z 2022-11-23T01:43:17.8407938Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8408040Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8408060Z 2022-11-23T01:43:17.8408158Z OK (skipped=1) 2022-11-23T01:43:17.8408177Z 2022-11-23T01:43:17.8408290Z Generating XML reports... 
2022-11-23T01:43:17.8408736Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012635.xml 2022-11-23T01:43:17.8409098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8409265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8409639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8409824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8409844Z 2022-11-23T01:43:17.8409954Z Running tests... 2022-11-23T01:43:17.8410218Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8410533Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8410802Z test_average_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8411027Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7179 2022-11-23T01:43:17.8411231Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7180 2022-11-23T01:43:17.8411605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8411783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8412168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8412425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8412804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8412985Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8413372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8413552Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8413803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8414053Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8414463Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8414869Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8415106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8415337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8415671Z [1669166803.000804] [d8f8c46cdf70:7180 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8415922Z [1669166803.006837] [d8f8c46cdf70:7180 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8416151Z [1669166803.006837] [d8f8c46cdf70:7180 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8416429Z [1669166802.992276] [d8f8c46cdf70:7179 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8416667Z [1669166802.998301] [d8f8c46cdf70:7179 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8416911Z [1669166802.998301] [d8f8c46cdf70:7179 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8417163Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8417410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8417818Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8418219Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T01:43:17.8418323Z ok (6.174s) 2022-11-23T01:43:17.8418344Z 2022-11-23T01:43:17.8418611Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8418712Z Ran 1 test in 6.174s 2022-11-23T01:43:17.8418732Z 2022-11-23T01:43:17.8418824Z OK 2022-11-23T01:43:17.8418844Z 2022-11-23T01:43:17.8418967Z Generating XML reports... 2022-11-23T01:43:17.8419421Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012637.xml 2022-11-23T01:43:17.8419806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8419985Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8420371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8420569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8420590Z 2022-11-23T01:43:17.8420699Z Running tests... 2022-11-23T01:43:17.8420951Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8421392Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8421656Z test_backend_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8421877Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7303 2022-11-23T01:43:17.8422101Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7304 2022-11-23T01:43:17.8422481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8422660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8423041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8423219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8423597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8423777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8424161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8424401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8424658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8424905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8425310Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8425713Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8425938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8426168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8426321Z skip: Need at least 3 CUDA devices (4.269s) 2022-11-23T01:43:17.8426341Z 2022-11-23T01:43:17.8426612Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8426727Z Ran 1 test in 4.269s 2022-11-23T01:43:17.8426746Z 2022-11-23T01:43:17.8426854Z OK (skipped=1) 2022-11-23T01:43:17.8426874Z 2022-11-23T01:43:17.8426999Z Generating XML reports... 2022-11-23T01:43:17.8427453Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012646.xml 2022-11-23T01:43:17.8427813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8427989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8428377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8428572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8428592Z 2022-11-23T01:43:17.8428700Z Running tests... 2022-11-23T01:43:17.8428968Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8429284Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8429537Z test_backend_group (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 3 (0.002s) 2022-11-23T01:43:17.8429557Z 2022-11-23T01:43:17.8429821Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8429917Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8429936Z 2022-11-23T01:43:17.8430044Z OK (skipped=1) 2022-11-23T01:43:17.8430063Z 2022-11-23T01:43:17.8430185Z Generating XML reports... 2022-11-23T01:43:17.8430756Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012653.xml 2022-11-23T01:43:17.8431132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8431310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8431702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8431897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8431917Z 2022-11-23T01:43:17.8432026Z Running tests... 2022-11-23T01:43:17.8432273Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8432586Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8432836Z test_barrier (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T01:43:17.8432859Z 2022-11-23T01:43:17.8433123Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8433235Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8433255Z 2022-11-23T01:43:17.8433364Z OK (skipped=1) 2022-11-23T01:43:17.8433383Z 2022-11-23T01:43:17.8433507Z Generating XML reports... 
2022-11-23T01:43:17.8434006Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012655.xml 2022-11-23T01:43:17.8434379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8434557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8434943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8435389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8435417Z 2022-11-23T01:43:17.8435530Z Running tests... 2022-11-23T01:43:17.8435800Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8436117Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8436377Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8436600Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7472 2022-11-23T01:43:17.8436801Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7473 2022-11-23T01:43:17.8437186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8437365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8437749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8437949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8438323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8438500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8438887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8439065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8439311Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8439555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8439962Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8440457Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8440692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8440922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8441210Z [1669166822.894761] [d8f8c46cdf70:7472 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8441457Z [1669166822.901061] [d8f8c46cdf70:7472 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8441707Z [1669166822.901061] [d8f8c46cdf70:7472 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8441970Z [1669166822.901312] [d8f8c46cdf70:7473 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8442210Z [1669166822.906272] [d8f8c46cdf70:7473 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8442453Z [1669166822.906272] [d8f8c46cdf70:7473 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8442557Z ok (6.369s) 2022-11-23T01:43:17.8442638Z 2022-11-23T01:43:17.8442921Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8443037Z Ran 1 test in 6.369s 2022-11-23T01:43:17.8443057Z 2022-11-23T01:43:17.8443150Z OK 2022-11-23T01:43:17.8443169Z 2022-11-23T01:43:17.8443293Z Generating XML reports... 
2022-11-23T01:43:17.8443745Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012658.xml 2022-11-23T01:43:17.8444107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8444291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8444679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8444875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8444896Z 2022-11-23T01:43:17.8445008Z Running tests... 2022-11-23T01:43:17.8445277Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8445596Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8445863Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T01:43:17.8445883Z 2022-11-23T01:43:17.8446147Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8446244Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8446263Z 2022-11-23T01:43:17.8446375Z OK (skipped=1) 2022-11-23T01:43:17.8446394Z 2022-11-23T01:43:17.8446520Z Generating XML reports... 
2022-11-23T01:43:17.8446973Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012707.xml 2022-11-23T01:43:17.8447351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8447532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8447921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8448116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8448136Z 2022-11-23T01:43:17.8448228Z Running tests... 2022-11-23T01:43:17.8448493Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8448808Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8449143Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8449364Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7619 2022-11-23T01:43:17.8449585Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7620 2022-11-23T01:43:17.8449966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8450145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8450530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8450708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8451083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8451262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8451645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8451837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8452132Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8452391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8452796Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8453182Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8453416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8453655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8453815Z skip: Skipped due to small world size. (4.204s) 2022-11-23T01:43:17.8453836Z 2022-11-23T01:43:17.8454104Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8454218Z Ran 1 test in 4.204s 2022-11-23T01:43:17.8454238Z 2022-11-23T01:43:17.8454350Z OK (skipped=1) 2022-11-23T01:43:17.8454369Z 2022-11-23T01:43:17.8454495Z Generating XML reports... 2022-11-23T01:43:17.8454947Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012709.xml 2022-11-23T01:43:17.8455309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8455487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8455874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8456073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8456092Z 2022-11-23T01:43:17.8456202Z Running tests... 2022-11-23T01:43:17.8456465Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8456786Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8457051Z test_barrier_group (__main__.TestDistBackendWithSpawn) ... 
skip: ucc does not support CPU barrier (0.002s) 2022-11-23T01:43:17.8457071Z 2022-11-23T01:43:17.8457331Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8457428Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8457447Z 2022-11-23T01:43:17.8457555Z OK (skipped=1) 2022-11-23T01:43:17.8457575Z 2022-11-23T01:43:17.8457690Z Generating XML reports... 2022-11-23T01:43:17.8458135Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012716.xml 2022-11-23T01:43:17.8458577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8458754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8459141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8459336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8459357Z 2022-11-23T01:43:17.8459464Z Running tests... 2022-11-23T01:43:17.8459711Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8460024Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8460291Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8460511Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7755 2022-11-23T01:43:17.8460732Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7756 2022-11-23T01:43:17.8461106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8461281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8461712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8461898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8462273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8462450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8462837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8463035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8463283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8463530Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8463941Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8464346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8464565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8464798Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8464960Z skip: Skipped due to small world size. (4.235s) 2022-11-23T01:43:17.8464984Z 2022-11-23T01:43:17.8465254Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8465366Z Ran 1 test in 4.235s 2022-11-23T01:43:17.8465386Z 2022-11-23T01:43:17.8465494Z OK (skipped=1) 2022-11-23T01:43:17.8465513Z 2022-11-23T01:43:17.8465636Z Generating XML reports... 2022-11-23T01:43:17.8466093Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012718.xml 2022-11-23T01:43:17.8466454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8466633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8467019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8467212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8467232Z 2022-11-23T01:43:17.8467401Z Running tests... 2022-11-23T01:43:17.8467670Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8467985Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8468268Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) ... 
skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T01:43:17.8468289Z 2022-11-23T01:43:17.8468555Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8468652Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8468671Z 2022-11-23T01:43:17.8468780Z OK (skipped=1) 2022-11-23T01:43:17.8468800Z 2022-11-23T01:43:17.8468923Z Generating XML reports... 2022-11-23T01:43:17.8469374Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012725.xml 2022-11-23T01:43:17.8469753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8469934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8470320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8470514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8470534Z 2022-11-23T01:43:17.8470644Z Running tests... 2022-11-23T01:43:17.8470948Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8471273Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8471552Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T01:43:17.8471572Z 2022-11-23T01:43:17.8471837Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8471952Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8471972Z 2022-11-23T01:43:17.8472087Z OK (skipped=1) 2022-11-23T01:43:17.8472106Z 2022-11-23T01:43:17.8472230Z Generating XML reports... 
2022-11-23T01:43:17.8472682Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012727.xml 2022-11-23T01:43:17.8473058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8473222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8473605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8473799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8473820Z 2022-11-23T01:43:17.8473929Z Running tests... 2022-11-23T01:43:17.8474193Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8474507Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8474789Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T01:43:17.8474809Z 2022-11-23T01:43:17.8475289Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8475394Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8475433Z 2022-11-23T01:43:17.8475525Z OK (skipped=1) 2022-11-23T01:43:17.8475549Z 2022-11-23T01:43:17.8475676Z Generating XML reports... 
2022-11-23T01:43:17.8476136Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012730.xml 2022-11-23T01:43:17.8476513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8476692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8477076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8477362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8477382Z 2022-11-23T01:43:17.8477492Z Running tests... 2022-11-23T01:43:17.8477743Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8478057Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8478323Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T01:43:17.8478344Z 2022-11-23T01:43:17.8478606Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8478718Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8478738Z 2022-11-23T01:43:17.8478847Z OK (skipped=1) 2022-11-23T01:43:17.8478867Z 2022-11-23T01:43:17.8478989Z Generating XML reports... 
2022-11-23T01:43:17.8479439Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012732.xml 2022-11-23T01:43:17.8479818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8479980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8480363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8480618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8480641Z 2022-11-23T01:43:17.8480758Z Running tests... 2022-11-23T01:43:17.8481023Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8481339Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8481607Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T01:43:17.8481627Z 2022-11-23T01:43:17.8481894Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8481990Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8482026Z 2022-11-23T01:43:17.8482117Z OK (skipped=1) 2022-11-23T01:43:17.8482136Z 2022-11-23T01:43:17.8482259Z Generating XML reports... 
2022-11-23T01:43:17.8482711Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012735.xml 2022-11-23T01:43:17.8483089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8483266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8483654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8483849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8483869Z 2022-11-23T01:43:17.8483980Z Running tests... 2022-11-23T01:43:17.8484235Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8484547Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8484824Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:43:17.8484844Z 2022-11-23T01:43:17.8485105Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8485220Z Ran 1 test in 0.003s 2022-11-23T01:43:17.8485240Z 2022-11-23T01:43:17.8485347Z OK (skipped=1) 2022-11-23T01:43:17.8485366Z 2022-11-23T01:43:17.8485490Z Generating XML reports... 
2022-11-23T01:43:17.8485938Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012737.xml 2022-11-23T01:43:17.8486312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8486545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8486933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8487127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8487148Z 2022-11-23T01:43:17.8487258Z Running tests... 2022-11-23T01:43:17.8487524Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8487836Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8488093Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T01:43:17.8488113Z 2022-11-23T01:43:17.8488376Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8488488Z Ran 1 test in 0.003s 2022-11-23T01:43:17.8488508Z 2022-11-23T01:43:17.8488600Z OK (skipped=1) 2022-11-23T01:43:17.8488623Z 2022-11-23T01:43:17.8488747Z Generating XML reports... 
2022-11-23T01:43:17.8489196Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012739.xml 2022-11-23T01:43:17.8489577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8489805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8490201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8490395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8490415Z 2022-11-23T01:43:17.8490523Z Running tests... 2022-11-23T01:43:17.8490772Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8491085Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8491367Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T01:43:17.8491388Z 2022-11-23T01:43:17.8491646Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8491759Z Ran 1 test in 0.003s 2022-11-23T01:43:17.8491779Z 2022-11-23T01:43:17.8491886Z OK (skipped=1) 2022-11-23T01:43:17.8491905Z 2022-11-23T01:43:17.8492035Z Generating XML reports... 
2022-11-23T01:43:17.8492481Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012742.xml 2022-11-23T01:43:17.8492858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8493019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8493403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8493602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8493622Z 2022-11-23T01:43:17.8493731Z Running tests... 2022-11-23T01:43:17.8493992Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8494304Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8494571Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:43:17.8494592Z 2022-11-23T01:43:17.8494859Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8494972Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8494992Z 2022-11-23T01:43:17.8495083Z OK (skipped=1) 2022-11-23T01:43:17.8495102Z 2022-11-23T01:43:17.8495224Z Generating XML reports... 
2022-11-23T01:43:17.8495674Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012744.xml 2022-11-23T01:43:17.8496114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8496295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8496679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8496878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8496898Z 2022-11-23T01:43:17.8497009Z Running tests... 2022-11-23T01:43:17.8497256Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8497570Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8497840Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:43:17.8497860Z 2022-11-23T01:43:17.8498121Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8498240Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8498260Z 2022-11-23T01:43:17.8498368Z OK (skipped=1) 2022-11-23T01:43:17.8498388Z 2022-11-23T01:43:17.8498510Z Generating XML reports... 
2022-11-23T01:43:17.8498958Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012747.xml 2022-11-23T01:43:17.8499387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8499559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8499946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8500140Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8500160Z 2022-11-23T01:43:17.8500268Z Running tests... 2022-11-23T01:43:17.8500530Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8500848Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8501126Z test_batch_isend_irecv_ring_exchange_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:43:17.8501146Z 2022-11-23T01:43:17.8501406Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8501522Z Ran 1 test in 0.003s 2022-11-23T01:43:17.8501541Z 2022-11-23T01:43:17.8501633Z OK (skipped=1) 2022-11-23T01:43:17.8501652Z 2022-11-23T01:43:17.8501775Z Generating XML reports... 
2022-11-23T01:43:17.8502219Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012749.xml 2022-11-23T01:43:17.8502595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8502772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8503161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8503355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8503374Z 2022-11-23T01:43:17.8503482Z Running tests... 2022-11-23T01:43:17.8503751Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8504051Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8504317Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:43:17.8504337Z 2022-11-23T01:43:17.8504599Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8504712Z Ran 1 test in 0.003s 2022-11-23T01:43:17.8504731Z 2022-11-23T01:43:17.8504840Z OK (skipped=1) 2022-11-23T01:43:17.8504860Z 2022-11-23T01:43:17.8504983Z Generating XML reports... 
2022-11-23T01:43:17.8505496Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012751.xml 2022-11-23T01:43:17.8505873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8506033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8506423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8506621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8506642Z 2022-11-23T01:43:17.8506749Z Running tests... 2022-11-23T01:43:17.8507011Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8507326Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8507593Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:43:17.8507617Z 2022-11-23T01:43:17.8507879Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8507991Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8508010Z 2022-11-23T01:43:17.8508101Z OK (skipped=1) 2022-11-23T01:43:17.8508136Z 2022-11-23T01:43:17.8508243Z Generating XML reports... 
2022-11-23T01:43:17.8508740Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012754.xml 2022-11-23T01:43:17.8509128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8509306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8509689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8509883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8509907Z 2022-11-23T01:43:17.8510016Z Running tests... 2022-11-23T01:43:17.8510281Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8510576Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8510832Z test_broadcast (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8511055Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8287 2022-11-23T01:43:17.8511273Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8288 2022-11-23T01:43:17.8511652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8511828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8512215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8512413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8512773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8512951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8513338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8513530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8513777Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8514027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8514433Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8514899Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8515345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8515683Z STAGE:2022-11-23 01:28:00 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8515918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8516249Z STAGE:2022-11-23 01:28:00 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8516534Z [1669166880.598734] [d8f8c46cdf70:8288 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8516774Z [1669166881.657764] [d8f8c46cdf70:8288 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8517027Z [1669166881.657764] [d8f8c46cdf70:8288 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8517307Z [1669166880.596225] [d8f8c46cdf70:8287 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8517617Z [1669166881.661150] [d8f8c46cdf70:8287 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8517874Z [1669166881.661150] [d8f8c46cdf70:8287 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8518431Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8518452Z 2022-11-23T01:43:17.8518809Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8519150Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8519483Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8519812Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8520149Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8520497Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8520830Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8521210Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8521541Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8521856Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8522192Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 
2022-11-23T01:43:17.8522522Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8522867Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8523210Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8523535Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8523859Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8524190Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8524595Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8524920Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8525270Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8525597Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8525920Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8526248Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8526576Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8526917Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8527263Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8527585Z STAGE:2022-11-23 
01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8527940Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8528280Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8528604Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8528944Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8529284Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8529605Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8529934Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8530477Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8530502Z 2022-11-23T01:43:17.8531072Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8531093Z 2022-11-23T01:43:17.8531423Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8531736Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8532075Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8532412Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed 
Stage: Collection 2022-11-23T01:43:17.8532761Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8533111Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8533436Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8533762Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8534096Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8534425Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8534813Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8535161Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8535487Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8535814Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8536145Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8536473Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8536816Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8537161Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8537492Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8537799Z 
STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8538127Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8538501Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8538855Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8539194Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8539520Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8539840Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8540218Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8540549Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8540871Z STAGE:2022-11-23 01:28:02 8287:8287 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8541215Z STAGE:2022-11-23 01:28:02 8288:8288 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8541319Z ok (5.943s) 2022-11-23T01:43:17.8541338Z 2022-11-23T01:43:17.8541607Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8541721Z Ran 1 test in 5.943s 2022-11-23T01:43:17.8541740Z 2022-11-23T01:43:17.8541835Z OK 2022-11-23T01:43:17.8541854Z 2022-11-23T01:43:17.8541978Z Generating XML reports... 
2022-11-23T01:43:17.8542430Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012756.xml 2022-11-23T01:43:17.8542797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8542978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8543364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8543564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8543584Z 2022-11-23T01:43:17.8543692Z Running tests... 2022-11-23T01:43:17.8543959Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8544273Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8544562Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and Nccl backend supports CUDA allReduce (0.002s) 2022-11-23T01:43:17.8544582Z 2022-11-23T01:43:17.8544907Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8545004Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8545024Z 2022-11-23T01:43:17.8545133Z OK (skipped=1) 2022-11-23T01:43:17.8545153Z 2022-11-23T01:43:17.8545276Z Generating XML reports... 
2022-11-23T01:43:17.8545726Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012805.xml 2022-11-23T01:43:17.8546106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8546286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8546672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8546867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8546887Z 2022-11-23T01:43:17.8546995Z Running tests... 2022-11-23T01:43:17.8547251Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8547567Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8547836Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8548058Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8434 2022-11-23T01:43:17.8548330Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8435 2022-11-23T01:43:17.8548720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8548898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8549279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8549457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8549841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8550018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8550401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8550596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8550843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8551090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8551495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8551896Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8552119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8552363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8552588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8552829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8553230Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8553568Z STAGE:2022-11-23 01:28:11 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8553969Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8554303Z STAGE:2022-11-23 01:28:11 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8554664Z [1669166891.577309] [d8f8c46cdf70:8434 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8554891Z [1669166892.613723] [d8f8c46cdf70:8434 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8555355Z [1669166892.613723] [d8f8c46cdf70:8434 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8555717Z STAGE:2022-11-23 01:28:12 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8555998Z [1669166891.600091] [d8f8c46cdf70:8435 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8556235Z [1669166892.624458] [d8f8c46cdf70:8435 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8556483Z [1669166892.624458] [d8f8c46cdf70:8435 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8556824Z STAGE:2022-11-23 01:28:12 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8557252Z STAGE:2022-11-23 01:28:12 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8557617Z STAGE:2022-11-23 01:28:12 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8557950Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8558260Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8558593Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8558919Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8559269Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8559613Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8559945Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8560274Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8560824Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8560847Z 
2022-11-23T01:43:17.8561415Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8561439Z 2022-11-23T01:43:17.8561769Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8562080Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8562420Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8562749Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8563095Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8563443Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8563770Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8564175Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8564507Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8564834Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8565163Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8565510Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8565837Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8566163Z STAGE:2022-11-23 
01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8566492Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8566822Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8567167Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8567509Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8567880Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8568197Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8568527Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8568854Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8569198Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8569550Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8569875Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8570198Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8570528Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8570873Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8571192Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] 
Completed Stage: Collection 2022-11-23T01:43:17.8571537Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8571859Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8572188Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8572517Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8572843Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8573188Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8573531Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8573839Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8574167Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8574495Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8574878Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8575218Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8575559Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8575887Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8576210Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8576536Z 
STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8576842Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8577182Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8577529Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8577852Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8578175Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8578600Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8578939Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8579283Z STAGE:2022-11-23 01:28:13 8434:8434 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8579623Z STAGE:2022-11-23 01:28:13 8435:8435 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8579711Z ok (5.863s) 2022-11-23T01:43:17.8579736Z 2022-11-23T01:43:17.8580008Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8580121Z Ran 1 test in 5.863s 2022-11-23T01:43:17.8580140Z 2022-11-23T01:43:17.8580233Z OK 2022-11-23T01:43:17.8580253Z 2022-11-23T01:43:17.8580377Z Generating XML reports... 
2022-11-23T01:43:17.8580837Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012807.xml 2022-11-23T01:43:17.8581217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8581396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8581766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8581962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8581982Z 2022-11-23T01:43:17.8582090Z Running tests... 2022-11-23T01:43:17.8582364Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8582680Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8582942Z test_broadcast_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8583167Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8548 2022-11-23T01:43:17.8583386Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8549 2022-11-23T01:43:17.8583764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8583927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8584312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8584507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8584945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8585124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8585508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8585705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8585957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8586187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8586593Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8586996Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8587235Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8587464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8587623Z skip: Skipped due to small world size. (4.350s) 2022-11-23T01:43:17.8587644Z 2022-11-23T01:43:17.8587960Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8588083Z Ran 1 test in 4.351s 2022-11-23T01:43:17.8588103Z 2022-11-23T01:43:17.8588213Z OK (skipped=1) 2022-11-23T01:43:17.8588233Z 2022-11-23T01:43:17.8588341Z Generating XML reports... 2022-11-23T01:43:17.8588797Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012816.xml 2022-11-23T01:43:17.8589174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8589358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8589743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8589939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8589960Z 2022-11-23T01:43:17.8590069Z Running tests... 2022-11-23T01:43:17.8590338Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8590655Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8590907Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8591126Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8651 2022-11-23T01:43:17.8591344Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8652 2022-11-23T01:43:17.8591717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8591900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8592283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8592480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8592856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8593018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8593399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8593591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8593841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8594146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8594554Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8594964Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8595576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8595823Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8596623Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:43:17.8596729Z warnings.warn( 2022-11-23T01:43:17.8597498Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:43:17.8597614Z warnings.warn( 2022-11-23T01:43:17.8597985Z [1669166907.701612] [d8f8c46cdf70:8651 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8598243Z [1669166907.707814] [d8f8c46cdf70:8651 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8598490Z [1669166907.707814] [d8f8c46cdf70:8651 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8598769Z [1669166907.709938] [d8f8c46cdf70:8652 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8599013Z [1669166907.715213] [d8f8c46cdf70:8652 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8599255Z [1669166907.715213] [d8f8c46cdf70:8652 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8599363Z ok (5.562s) 2022-11-23T01:43:17.8599385Z 2022-11-23T01:43:17.8599643Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8599758Z Ran 1 test in 5.562s 2022-11-23T01:43:17.8599778Z 2022-11-23T01:43:17.8599875Z OK 2022-11-23T01:43:17.8599895Z 2022-11-23T01:43:17.8600021Z Generating XML reports... 2022-11-23T01:43:17.8600475Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012822.xml 2022-11-23T01:43:17.8600858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8601043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8601434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8601616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8601657Z 2022-11-23T01:43:17.8601751Z Running tests... 2022-11-23T01:43:17.8602022Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8602342Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8602613Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8603370Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82847 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.642s) 2022-11-23T01:43:17.8603459Z 2022-11-23T01:43:17.8603736Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8603853Z Ran 1 test in 1.642s 2022-11-23T01:43:17.8603873Z 2022-11-23T01:43:17.8603982Z OK (skipped=1) 2022-11-23T01:43:17.8604002Z 2022-11-23T01:43:17.8604132Z Generating XML reports... 2022-11-23T01:43:17.8604571Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012831.xml 2022-11-23T01:43:17.8604951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8605130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8605519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8605719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8605739Z 2022-11-23T01:43:17.8605848Z Running tests... 2022-11-23T01:43:17.8606115Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8606434Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8606804Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8607545Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85012 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.656s) 2022-11-23T01:43:17.8607586Z 2022-11-23T01:43:17.8607833Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8607955Z Ran 1 test in 1.657s 2022-11-23T01:43:17.8607975Z 2022-11-23T01:43:17.8608086Z OK (skipped=1) 2022-11-23T01:43:17.8608106Z 2022-11-23T01:43:17.8608231Z Generating XML reports... 2022-11-23T01:43:17.8608680Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012835.xml 2022-11-23T01:43:17.8609063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8609245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8609633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8609813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8609850Z 2022-11-23T01:43:17.8609943Z Running tests... 2022-11-23T01:43:17.8610208Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8610524Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8610849Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8611606Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85339 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.637s) 2022-11-23T01:43:17.8611628Z 2022-11-23T01:43:17.8611896Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8612014Z Ran 1 test in 1.637s 2022-11-23T01:43:17.8612034Z 2022-11-23T01:43:17.8612142Z OK (skipped=1) 2022-11-23T01:43:17.8612162Z 2022-11-23T01:43:17.8612287Z Generating XML reports... 2022-11-23T01:43:17.8612781Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012839.xml 2022-11-23T01:43:17.8613162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8613344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8613740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8613937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8613958Z 2022-11-23T01:43:17.8614066Z Running tests... 2022-11-23T01:43:17.8614335Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8614651Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8614922Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8615130Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8867 2022-11-23T01:43:17.8615351Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8868 2022-11-23T01:43:17.8615734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8615964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8616361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8616559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8616937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8617116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8617486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8617688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8617937Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8618189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8618599Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8619006Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8619246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8619480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8619737Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2j0ksrer 2022-11-23T01:43:17.8620001Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2j0ksrer/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8620257Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb6vrf5hw 2022-11-23T01:43:17.8620530Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb6vrf5hw/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8620813Z [1669166928.358536] [d8f8c46cdf70:8867 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8621054Z [1669166928.366146] [d8f8c46cdf70:8867 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8621344Z [1669166928.366146] [d8f8c46cdf70:8867 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8621588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8621946Z [1669166928.366121] [d8f8c46cdf70:8868 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8622181Z [1669166928.371290] [d8f8c46cdf70:8868 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8622427Z [1669166928.371290] [d8f8c46cdf70:8868 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8622650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.8622755Z ok (5.919s) 2022-11-23T01:43:17.8622776Z 2022-11-23T01:43:17.8623056Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8623173Z Ran 1 test in 5.920s 2022-11-23T01:43:17.8623192Z 2022-11-23T01:43:17.8623286Z OK 2022-11-23T01:43:17.8623306Z 2022-11-23T01:43:17.8623436Z Generating XML reports... 2022-11-23T01:43:17.8623894Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012843.xml 2022-11-23T01:43:17.8624279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8624493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8624894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8625091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8625112Z 2022-11-23T01:43:17.8625223Z Running tests... 2022-11-23T01:43:17.8625490Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8625809Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8626092Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8626318Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8985 2022-11-23T01:43:17.8626538Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8986 2022-11-23T01:43:17.8626903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8627082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8627470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8627667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8628046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8628223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8628611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8628805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8629036Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8629287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8629695Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8630104Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8630338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8630569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8630897Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg3dtb7r0 2022-11-23T01:43:17.8631173Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg3dtb7r0/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8631431Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe94xuiz8 2022-11-23T01:43:17.8631688Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe94xuiz8/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8631971Z [1669166936.940132] [d8f8c46cdf70:8985 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8632213Z [1669166936.946150] [d8f8c46cdf70:8985 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8632462Z [1669166936.946150] [d8f8c46cdf70:8985 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8632747Z [1669166936.940192] [d8f8c46cdf70:8986 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8632987Z [1669166936.946453] [d8f8c46cdf70:8986 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8633279Z [1669166936.946453] [d8f8c46cdf70:8986 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8633533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8633773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.8634010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8634229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8634339Z ok (6.077s) 2022-11-23T01:43:17.8634361Z 2022-11-23T01:43:17.8634641Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8634758Z Ran 1 test in 6.077s 2022-11-23T01:43:17.8634778Z 2022-11-23T01:43:17.8634872Z OK 2022-11-23T01:43:17.8634892Z 2022-11-23T01:43:17.8635191Z Generating XML reports... 2022-11-23T01:43:17.8635673Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012852.xml 2022-11-23T01:43:17.8636061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8636225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8636616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8636815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8636840Z 2022-11-23T01:43:17.8636952Z Running tests... 2022-11-23T01:43:17.8637225Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8637543Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8637820Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8638583Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78641 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.621s) 2022-11-23T01:43:17.8638605Z 2022-11-23T01:43:17.8638872Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8638988Z Ran 1 test in 1.621s 2022-11-23T01:43:17.8639008Z 2022-11-23T01:43:17.8639099Z OK (skipped=1) 2022-11-23T01:43:17.8639201Z 2022-11-23T01:43:17.8639336Z Generating XML reports... 2022-11-23T01:43:17.8639787Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012900.xml 2022-11-23T01:43:17.8640165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8640351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8640739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8640935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8640955Z 2022-11-23T01:43:17.8641064Z Running tests... 2022-11-23T01:43:17.8641314Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8641633Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8641934Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8642750Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77261 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.641s) 2022-11-23T01:43:17.8642774Z 2022-11-23T01:43:17.8643049Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8643165Z Ran 1 test in 1.642s 2022-11-23T01:43:17.8643185Z 2022-11-23T01:43:17.8643294Z OK (skipped=1) 2022-11-23T01:43:17.8643314Z 2022-11-23T01:43:17.8643440Z Generating XML reports... 2022-11-23T01:43:17.8643888Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012904.xml 2022-11-23T01:43:17.8644264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8644434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8644822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8645016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8645040Z 2022-11-23T01:43:17.8645152Z Running tests... 2022-11-23T01:43:17.8645415Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8645728Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8646021Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8646242Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9171 2022-11-23T01:43:17.8646462Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9172 2022-11-23T01:43:17.8646831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8647012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8647408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8647607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8647987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8648165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8648549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8648744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8649040Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8649291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8649700Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8650109Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8650342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8650573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8650834Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq9zhzf18 2022-11-23T01:43:17.8651108Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq9zhzf18/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8651370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpttlv8jlk 2022-11-23T01:43:17.8651625Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpttlv8jlk/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8651961Z [1669166953.781245] [d8f8c46cdf70:9172 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8652215Z [1669166953.788004] [d8f8c46cdf70:9172 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8652462Z [1669166953.788004] [d8f8c46cdf70:9172 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8652686Z 2022-11-23T01:43:17.8652965Z [1669166953.776024] [d8f8c46cdf70:9171 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8653204Z [1669166953.783209] [d8f8c46cdf70:9171 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8653444Z [1669166953.783209] [d8f8c46cdf70:9171 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8653547Z ok (5.565s) 2022-11-23T01:43:17.8653571Z 2022-11-23T01:43:17.8653843Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8653941Z Ran 1 test in 5.566s 2022-11-23T01:43:17.8653960Z 2022-11-23T01:43:17.8654053Z OK 2022-11-23T01:43:17.8654073Z 2022-11-23T01:43:17.8654197Z Generating XML reports... 2022-11-23T01:43:17.8654652Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012909.xml 2022-11-23T01:43:17.8655034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8655219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8655605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8655801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8655822Z 2022-11-23T01:43:17.8655915Z Running tests... 2022-11-23T01:43:17.8656190Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8656513Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8656828Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8657050Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9285 2022-11-23T01:43:17.8657270Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9286 2022-11-23T01:43:17.8657714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8657894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8658282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8658463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8658840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8659019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8659406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8659601Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8659849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8660099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8660510Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8660950Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8661195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8661424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8661681Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1__xi7bm 2022-11-23T01:43:17.8661953Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1__xi7bm/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8662216Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaptofwx1 2022-11-23T01:43:17.8662490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaptofwx1/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8662777Z [1669166961.893880] [d8f8c46cdf70:9286 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8663023Z [1669166961.899969] [d8f8c46cdf70:9286 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8663271Z [1669166961.899969] [d8f8c46cdf70:9286 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8663536Z [1669166961.890829] [d8f8c46cdf70:9285 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8663775Z [1669166961.897438] [d8f8c46cdf70:9285 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8664024Z [1669166961.897438] [d8f8c46cdf70:9285 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8664132Z ok (5.555s) 2022-11-23T01:43:17.8664152Z 2022-11-23T01:43:17.8664429Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8664549Z Ran 1 test in 5.555s 2022-11-23T01:43:17.8664569Z 2022-11-23T01:43:17.8664663Z OK 2022-11-23T01:43:17.8664682Z 2022-11-23T01:43:17.8664810Z Generating XML reports... 2022-11-23T01:43:17.8665265Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012917.xml 2022-11-23T01:43:17.8665627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8665807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8666256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8666452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8666473Z 2022-11-23T01:43:17.8666584Z Running tests... 2022-11-23T01:43:17.8666855Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8667177Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8667448Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8667655Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9399 2022-11-23T01:43:17.8667874Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9400 2022-11-23T01:43:17.8668251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8668434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8668821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8669018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8669447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8669636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8670026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8670203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8670453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8670704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8671116Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8671523Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8671762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8671993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8672253Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkislb3ga 2022-11-23T01:43:17.8672534Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkislb3ga/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8672777Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4t04c9ow 2022-11-23T01:43:17.8673047Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4t04c9ow/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8673289Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8673529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8673818Z [1669166970.073043] [d8f8c46cdf70:9400 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8674058Z [1669166970.078461] [d8f8c46cdf70:9400 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8674304Z [1669166970.078461] [d8f8c46cdf70:9400 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8674583Z [1669166970.065618] [d8f8c46cdf70:9399 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8674883Z [1669166970.073214] [d8f8c46cdf70:9399 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8675337Z [1669166970.073214] [d8f8c46cdf70:9399 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8675453Z ok (6.063s) 2022-11-23T01:43:17.8675475Z 2022-11-23T01:43:17.8675762Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8675880Z Ran 1 test in 6.063s 2022-11-23T01:43:17.8675900Z 2022-11-23T01:43:17.8675996Z OK 2022-11-23T01:43:17.8676016Z 2022-11-23T01:43:17.8676142Z Generating XML reports... 2022-11-23T01:43:17.8676596Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012925.xml 2022-11-23T01:43:17.8676979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8677166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8677540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8677738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8677759Z 2022-11-23T01:43:17.8677868Z Running tests... 2022-11-23T01:43:17.8678217Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8678549Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8678846Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8679065Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9517 2022-11-23T01:43:17.8679283Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9518 2022-11-23T01:43:17.8679642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8679831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8680218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8680419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8680795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8680974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8681361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8681555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8681801Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8682037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8682446Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8682862Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8683099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8683330Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8683589Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo8ut8nbc 2022-11-23T01:43:17.8683861Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo8ut8nbc/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8684118Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp2g46uq4 2022-11-23T01:43:17.8684478Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp2g46uq4/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8684744Z [1669166978.654985] [d8f8c46cdf70:9518 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8684989Z [1669166978.660572] [d8f8c46cdf70:9518 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8685237Z [1669166978.660572] [d8f8c46cdf70:9518 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8685517Z [1669166978.648385] [d8f8c46cdf70:9517 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8685755Z [1669166978.654020] [d8f8c46cdf70:9517 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8686004Z [1669166978.654020] [d8f8c46cdf70:9517 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8686830Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:43:17.8686947Z ok (6.071s) 2022-11-23T01:43:17.8686968Z 2022-11-23T01:43:17.8687246Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8687364Z Ran 1 test in 6.071s 2022-11-23T01:43:17.8687384Z 2022-11-23T01:43:17.8687478Z OK 2022-11-23T01:43:17.8687501Z 2022-11-23T01:43:17.8687610Z Generating XML reports... 2022-11-23T01:43:17.8688070Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012933.xml 2022-11-23T01:43:17.8688455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8688642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8689030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8689226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8689246Z 2022-11-23T01:43:17.8689356Z Running tests... 2022-11-23T01:43:17.8689623Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8689941Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8690214Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8690976Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78235 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.647s) 2022-11-23T01:43:17.8690998Z 2022-11-23T01:43:17.8691267Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8691383Z Ran 1 test in 1.647s 2022-11-23T01:43:17.8691403Z 2022-11-23T01:43:17.8691513Z OK (skipped=1) 2022-11-23T01:43:17.8691532Z 2022-11-23T01:43:17.8691659Z Generating XML reports... 2022-11-23T01:43:17.8692116Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012942.xml 2022-11-23T01:43:17.8692498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8692740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8693118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8693320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8693341Z 2022-11-23T01:43:17.8693452Z Running tests... 2022-11-23T01:43:17.8693721Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8694039Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8694302Z test_ddp_create_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8694525Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9669 2022-11-23T01:43:17.8694743Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9670 2022-11-23T01:43:17.8695124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8695289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8695720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8695926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8696305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8696484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8696869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8697064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8697320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8697556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8697967Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8698377Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8698616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8698879Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5xgj4xc0 2022-11-23T01:43:17.8699153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5xgj4xc0/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8699385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8699650Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplwz6kqjm 2022-11-23T01:43:17.8699922Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplwz6kqjm/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8700193Z [1669166990.674346] [d8f8c46cdf70:9670 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8700436Z [1669166991.452213] [d8f8c46cdf70:9670 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8700684Z [1669166991.452213] [d8f8c46cdf70:9670 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8700962Z [1669166990.672271] [d8f8c46cdf70:9669 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8701260Z [1669166991.480000] [d8f8c46cdf70:9669 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8701509Z [1669166991.480000] [d8f8c46cdf70:9669 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8702417Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8703309Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8704528Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 2022-11-23T01:43:17.8704776Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T01:43:17.8705936Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 
2022-11-23T01:43:17.8706179Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T01:43:17.8706426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8706669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8707564Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8708456Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8709341Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8710213Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8711148Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8712023Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8712888Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8713806Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8714684Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8715777Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:43:17.8715892Z ok (5.512s) 2022-11-23T01:43:17.8715915Z 2022-11-23T01:43:17.8716193Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8716309Z Ran 1 test in 5.513s 2022-11-23T01:43:17.8716330Z 2022-11-23T01:43:17.8716407Z OK 2022-11-23T01:43:17.8716446Z 2022-11-23T01:43:17.8716555Z Generating XML reports... 2022-11-23T01:43:17.8717126Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012946.xml 2022-11-23T01:43:17.8717517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8717698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8718086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8718288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8718308Z 2022-11-23T01:43:17.8718419Z Running tests... 2022-11-23T01:43:17.8718690Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8718987Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8719243Z test_ddp_device (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8719996Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77324 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.628s) 2022-11-23T01:43:17.8720152Z 2022-11-23T01:43:17.8720432Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8720547Z Ran 1 test in 1.628s 2022-11-23T01:43:17.8720567Z 2022-11-23T01:43:17.8720682Z OK (skipped=1) 2022-11-23T01:43:17.8720702Z 2022-11-23T01:43:17.8720828Z Generating XML reports... 2022-11-23T01:43:17.8721323Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012954.xml 2022-11-23T01:43:17.8721709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8721892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8722261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8722464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8722485Z 2022-11-23T01:43:17.8722596Z Running tests... 2022-11-23T01:43:17.8722866Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8723253Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8723544Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8723767Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9817 2022-11-23T01:43:17.8723986Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9818 2022-11-23T01:43:17.8724353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8724534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8724928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8725124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8725512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8725693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8726079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8726274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8726510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8726761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8727173Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8727580Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8727819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8728052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8728313Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprkobl_3t 2022-11-23T01:43:17.8728587Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprkobl_3t/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8728844Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqr_6d_q7 2022-11-23T01:43:17.8729096Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqr_6d_q7/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8729957Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T01:43:17.8730302Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T01:43:17.8731100Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T01:43:17.8731438Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T01:43:17.8731726Z [1669167003.572102] [d8f8c46cdf70:9818 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8731966Z [1669167003.578791] [d8f8c46cdf70:9818 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8732293Z [1669167003.578791] [d8f8c46cdf70:9818 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8732583Z [1669167003.570710] [d8f8c46cdf70:9817 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8732823Z [1669167003.577379] [d8f8c46cdf70:9817 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8733069Z [1669167003.577379] [d8f8c46cdf70:9817 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8733182Z ok (6.012s) 2022-11-23T01:43:17.8733202Z 2022-11-23T01:43:17.8733460Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8733584Z Ran 1 test in 6.013s 2022-11-23T01:43:17.8733604Z 2022-11-23T01:43:17.8733697Z OK 2022-11-23T01:43:17.8733717Z 2022-11-23T01:43:17.8733841Z Generating XML reports... 2022-11-23T01:43:17.8734299Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012958.xml 2022-11-23T01:43:17.8734682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8734864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8735257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8735455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8735479Z 2022-11-23T01:43:17.8735573Z Running tests... 
2022-11-23T01:43:17.8735845Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8736164Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8736442Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8737202Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78685 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.639s) 2022-11-23T01:43:17.8737224Z 2022-11-23T01:43:17.8737491Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8737609Z Ran 1 test in 1.639s 2022-11-23T01:43:17.8737629Z 2022-11-23T01:43:17.8737740Z OK (skipped=1) 2022-11-23T01:43:17.8737813Z 2022-11-23T01:43:17.8737947Z Generating XML reports... 2022-11-23T01:43:17.8738382Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013007.xml 2022-11-23T01:43:17.8738763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8738951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8739336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8739533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8739553Z 2022-11-23T01:43:17.8739665Z Running tests... 
2022-11-23T01:43:17.8739933Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8740252Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8740536Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8741315Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77293 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.607s) 2022-11-23T01:43:17.8741360Z 2022-11-23T01:43:17.8741615Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8741730Z Ran 1 test in 1.608s 2022-11-23T01:43:17.8741750Z 2022-11-23T01:43:17.8741859Z OK (skipped=1) 2022-11-23T01:43:17.8741878Z 2022-11-23T01:43:17.8742004Z Generating XML reports... 2022-11-23T01:43:17.8742455Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013011.xml 2022-11-23T01:43:17.8742833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8743020Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8743407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8743602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8743627Z 2022-11-23T01:43:17.8743719Z Running tests... 
2022-11-23T01:43:17.8743985Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8744303Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8744598Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8744822Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10033 2022-11-23T01:43:17.8745044Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10034 2022-11-23T01:43:17.8745430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8745612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8745991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8746190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8746565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8746745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8747127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8747311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8747631Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8747880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8748290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8748681Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8748917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8749148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8749392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8749637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8750043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8750444Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8750756Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg0zrzeiu 2022-11-23T01:43:17.8751041Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg0zrzeiu/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8751282Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptds2n2lz 2022-11-23T01:43:17.8751554Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptds2n2lz/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8751837Z [1669167020.247496] [d8f8c46cdf70:10033:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8752085Z [1669167020.254402] [d8f8c46cdf70:10033:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8752334Z [1669167020.254402] [d8f8c46cdf70:10033:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8752617Z [1669167020.248859] [d8f8c46cdf70:10034:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8752857Z [1669167020.254340] [d8f8c46cdf70:10034:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8753099Z [1669167020.254340] [d8f8c46cdf70:10034:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8753341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8753581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8753809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8754048Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8754153Z ok (6.116s) 2022-11-23T01:43:17.8754174Z 2022-11-23T01:43:17.8754455Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8754573Z Ran 1 test in 6.116s 2022-11-23T01:43:17.8754593Z 2022-11-23T01:43:17.8754687Z OK 2022-11-23T01:43:17.8754706Z 2022-11-23T01:43:17.8754831Z Generating XML reports... 
2022-11-23T01:43:17.8755523Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013015.xml 2022-11-23T01:43:17.8755898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8756082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8756564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8756762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8756784Z 2022-11-23T01:43:17.8756894Z Running tests... 2022-11-23T01:43:17.8757172Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8757486Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8757770Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8757975Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10151 2022-11-23T01:43:17.8758197Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10152 2022-11-23T01:43:17.8758576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8758762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8759155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8759350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8759790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8759982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8760375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8760554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8760804Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8761057Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8761465Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8761876Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8762114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8762399Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:43:17.8762627Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8762902Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:43:17.8763147Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8osln4ov 2022-11-23T01:43:17.8763426Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8osln4ov/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8763684Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprxhyhacb 2022-11-23T01:43:17.8763961Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprxhyhacb/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8764243Z [1669167029.061780] [d8f8c46cdf70:10152:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8764482Z [1669167029.068854] [d8f8c46cdf70:10152:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8764729Z [1669167029.068854] [d8f8c46cdf70:10152:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8765010Z [1669167029.054721] [d8f8c46cdf70:10151:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8765314Z [1669167029.060161] [d8f8c46cdf70:10151:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8765568Z [1669167029.060161] [d8f8c46cdf70:10151:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8765795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8766035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8766273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8766512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8766797Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T01:43:17.8767083Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T01:43:17.8767361Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:43:17.8767682Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:43:17.8767932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8768150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8768387Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8768622Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8768902Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 
2022-11-23T01:43:17.8769186Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T01:43:17.8769467Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T01:43:17.8769741Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T01:43:17.8769977Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8770199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8770438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8770670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8770773Z ok (6.766s) 2022-11-23T01:43:17.8770794Z 2022-11-23T01:43:17.8771079Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8771197Z Ran 1 test in 6.766s 2022-11-23T01:43:17.8771218Z 2022-11-23T01:43:17.8771312Z OK 2022-11-23T01:43:17.8771332Z 2022-11-23T01:43:17.8771457Z Generating XML reports... 2022-11-23T01:43:17.8771919Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013024.xml 2022-11-23T01:43:17.8772286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8772468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8772856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8773052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8773074Z 2022-11-23T01:43:17.8773185Z Running tests... 
2022-11-23T01:43:17.8773520Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8773836Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8774112Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8774872Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77378 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.611s) 2022-11-23T01:43:17.8774894Z 2022-11-23T01:43:17.8775162Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8775260Z Ran 1 test in 1.611s 2022-11-23T01:43:17.8775281Z 2022-11-23T01:43:17.8775391Z OK (skipped=1) 2022-11-23T01:43:17.8775411Z 2022-11-23T01:43:17.8775537Z Generating XML reports... 2022-11-23T01:43:17.8775992Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013033.xml 2022-11-23T01:43:17.8776373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8776554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8776990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8777198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8777219Z 2022-11-23T01:43:17.8777312Z Running tests... 
2022-11-23T01:43:17.8777579Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8777897Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8778177Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8778409Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10303 2022-11-23T01:43:17.8778632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10304 2022-11-23T01:43:17.8779013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8779197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8779586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8779764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8780137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8780316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8780708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8780904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8781153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8781403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8781814Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8782203Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8782440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8782993Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T01:43:17.8783285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8783839Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T01:43:17.8784101Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbgfu7590 2022-11-23T01:43:17.8784374Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbgfu7590/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8784632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9gfonefx 2022-11-23T01:43:17.8784907Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9gfonefx/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8785145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8785432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.8785705Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T01:43:17.8785985Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T01:43:17.8786289Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T01:43:17.8786625Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T01:43:17.8786931Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T01:43:17.8787259Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T01:43:17.8787596Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T01:43:17.8787925Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T01:43:17.8788207Z [1669167042.479340] [d8f8c46cdf70:10303:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8788448Z [1669167042.484989] [d8f8c46cdf70:10303:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8788703Z [1669167042.484989] [d8f8c46cdf70:10303:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8788967Z [1669167042.479402] [d8f8c46cdf70:10304:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8789205Z [1669167042.484877] [d8f8c46cdf70:10304:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8789451Z [1669167042.484877] [d8f8c46cdf70:10304:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8789695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8789933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8790037Z ok (6.067s) 2022-11-23T01:43:17.8790119Z 2022-11-23T01:43:17.8790406Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8790525Z Ran 1 test in 6.067s 2022-11-23T01:43:17.8790546Z 2022-11-23T01:43:17.8790639Z OK 2022-11-23T01:43:17.8790659Z 2022-11-23T01:43:17.8790767Z Generating XML reports... 2022-11-23T01:43:17.8791231Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013037.xml 2022-11-23T01:43:17.8791614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8791798Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8792187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8792386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8792407Z 2022-11-23T01:43:17.8792521Z Running tests... 2022-11-23T01:43:17.8792793Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8793092Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8793542Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... 
skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8793566Z 2022-11-23T01:43:17.8793837Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8793953Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8793973Z 2022-11-23T01:43:17.8794085Z OK (skipped=1) 2022-11-23T01:43:17.8794105Z 2022-11-23T01:43:17.8794231Z Generating XML reports... 2022-11-23T01:43:17.8794684Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013046.xml 2022-11-23T01:43:17.8795292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8795495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8795874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8796072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8796098Z 2022-11-23T01:43:17.8796212Z Running tests... 2022-11-23T01:43:17.8796478Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8796795Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8797190Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8797211Z 2022-11-23T01:43:17.8797475Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8797594Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8797614Z 2022-11-23T01:43:17.8797725Z OK (skipped=1) 2022-11-23T01:43:17.8797744Z 2022-11-23T01:43:17.8797852Z Generating XML reports... 
2022-11-23T01:43:17.8798301Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013048.xml 2022-11-23T01:43:17.8798684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8798865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8799252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8799448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8799468Z 2022-11-23T01:43:17.8799578Z Running tests... 2022-11-23T01:43:17.8799943Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8800258Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8800696Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8800739Z 2022-11-23T01:43:17.8800986Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8801101Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8801121Z 2022-11-23T01:43:17.8801232Z OK (skipped=1) 2022-11-23T01:43:17.8801252Z 2022-11-23T01:43:17.8801376Z Generating XML reports... 
2022-11-23T01:43:17.8801820Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013051.xml 2022-11-23T01:43:17.8802192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8802379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8802765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8802941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8803043Z 2022-11-23T01:43:17.8803144Z Running tests... 2022-11-23T01:43:17.8803414Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8803731Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8804183Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8804208Z 2022-11-23T01:43:17.8804475Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8804590Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8804610Z 2022-11-23T01:43:17.8804722Z OK (skipped=1) 2022-11-23T01:43:17.8804742Z 2022-11-23T01:43:17.8804866Z Generating XML reports... 
2022-11-23T01:43:17.8805317Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013053.xml 2022-11-23T01:43:17.8805677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8805857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8806241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8806433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8806453Z 2022-11-23T01:43:17.8806567Z Running tests... 2022-11-23T01:43:17.8806831Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8807146Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8807598Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8807619Z 2022-11-23T01:43:17.8807881Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8807978Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8807998Z 2022-11-23T01:43:17.8808109Z OK (skipped=1) 2022-11-23T01:43:17.8808128Z 2022-11-23T01:43:17.8808252Z Generating XML reports... 
2022-11-23T01:43:17.8808696Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013055.xml 2022-11-23T01:43:17.8809151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8809332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8809711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8809911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8809933Z 2022-11-23T01:43:17.8810043Z Running tests... 2022-11-23T01:43:17.8810289Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8810605Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8811054Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8811079Z 2022-11-23T01:43:17.8811348Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8811462Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8811482Z 2022-11-23T01:43:17.8811592Z OK (skipped=1) 2022-11-23T01:43:17.8811611Z 2022-11-23T01:43:17.8811738Z Generating XML reports... 
2022-11-23T01:43:17.8812248Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013058.xml 2022-11-23T01:43:17.8812639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8812801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8813188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8813386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8813411Z 2022-11-23T01:43:17.8813523Z Running tests... 2022-11-23T01:43:17.8813790Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8814110Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8814568Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8814589Z 2022-11-23T01:43:17.8814857Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8814972Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8814991Z 2022-11-23T01:43:17.8815083Z OK (skipped=1) 2022-11-23T01:43:17.8815123Z 2022-11-23T01:43:17.8815229Z Generating XML reports... 
2022-11-23T01:43:17.8815680Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013100.xml 2022-11-23T01:43:17.8816065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8816246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8816630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8816832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8816852Z 2022-11-23T01:43:17.8816963Z Running tests... 2022-11-23T01:43:17.8817232Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8817532Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8817978Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8818054Z 2022-11-23T01:43:17.8818332Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8818448Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8818468Z 2022-11-23T01:43:17.8818575Z OK (skipped=1) 2022-11-23T01:43:17.8818595Z 2022-11-23T01:43:17.8818719Z Generating XML reports... 
2022-11-23T01:43:17.8819176Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013103.xml 2022-11-23T01:43:17.8819557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8819738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8820109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8820305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8820329Z 2022-11-23T01:43:17.8820439Z Running tests... 2022-11-23T01:43:17.8820710Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8821028Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8821572Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8821598Z 2022-11-23T01:43:17.8821878Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8821994Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8822014Z 2022-11-23T01:43:17.8822123Z OK (skipped=1) 2022-11-23T01:43:17.8822143Z 2022-11-23T01:43:17.8822268Z Generating XML reports... 
2022-11-23T01:43:17.8822703Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013105.xml 2022-11-23T01:43:17.8823093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8823270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8823659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8823859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8823879Z 2022-11-23T01:43:17.8823987Z Running tests... 2022-11-23T01:43:17.8824254Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8824569Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8825016Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8825040Z 2022-11-23T01:43:17.8825294Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8825410Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8825429Z 2022-11-23T01:43:17.8825539Z OK (skipped=1) 2022-11-23T01:43:17.8825558Z 2022-11-23T01:43:17.8825685Z Generating XML reports... 
2022-11-23T01:43:17.8826136Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013107.xml 2022-11-23T01:43:17.8826514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8826693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8827079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8827272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8827346Z 2022-11-23T01:43:17.8827446Z Running tests... 2022-11-23T01:43:17.8827715Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8828033Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8828431Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8828452Z 2022-11-23T01:43:17.8828713Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8828827Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8828847Z 2022-11-23T01:43:17.8828958Z OK (skipped=1) 2022-11-23T01:43:17.8828977Z 2022-11-23T01:43:17.8829103Z Generating XML reports... 
2022-11-23T01:43:17.8829551Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013110.xml 2022-11-23T01:43:17.8829911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8830092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8830527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8830729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8830750Z 2022-11-23T01:43:17.8830859Z Running tests... 2022-11-23T01:43:17.8831126Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8831442Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8831830Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:43:17.8831856Z 2022-11-23T01:43:17.8832118Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8832216Z Ran 1 test in 0.002s 2022-11-23T01:43:17.8832235Z 2022-11-23T01:43:17.8832343Z OK (skipped=1) 2022-11-23T01:43:17.8832363Z 2022-11-23T01:43:17.8832488Z Generating XML reports... 
2022-11-23T01:43:17.8832941Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013112.xml 2022-11-23T01:43:17.8833315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8833495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8833877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8834073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8834098Z 2022-11-23T01:43:17.8834209Z Running tests... 2022-11-23T01:43:17.8834459Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8834772Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8835259Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8836035Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77325 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.652s) 2022-11-23T01:43:17.8836058Z 2022-11-23T01:43:17.8836327Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8836443Z Ran 1 test in 1.652s 2022-11-23T01:43:17.8836464Z 2022-11-23T01:43:17.8836575Z OK (skipped=1) 2022-11-23T01:43:17.8836681Z 2022-11-23T01:43:17.8836818Z Generating XML reports... 
2022-11-23T01:43:17.8837275Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013115.xml 2022-11-23T01:43:17.8837637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8837826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8838217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8838416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8838437Z 2022-11-23T01:43:17.8838545Z Running tests... 2022-11-23T01:43:17.8838812Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8839133Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8839397Z test_ddp_inference (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8839620Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10851 2022-11-23T01:43:17.8839825Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10852 2022-11-23T01:43:17.8840319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8840515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8840911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8841106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8841481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8841664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8842048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8842228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8842477Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8842730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8843142Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8843546Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8843781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8844017Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8844279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpenfo4wo_ 2022-11-23T01:43:17.8844550Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpenfo4wo_/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8844799Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaakcqwih 2022-11-23T01:43:17.8845074Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaakcqwih/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8845357Z [1669167084.064565] [d8f8c46cdf70:10851:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8845599Z [1669167084.070574] [d8f8c46cdf70:10851:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8845846Z [1669167084.070574] [d8f8c46cdf70:10851:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8846189Z [1669167084.076575] [d8f8c46cdf70:10852:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8846426Z [1669167084.082099] [d8f8c46cdf70:10852:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8846675Z [1669167084.082099] [d8f8c46cdf70:10852:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8846783Z ok (6.269s) 2022-11-23T01:43:17.8846805Z 2022-11-23T01:43:17.8847083Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8847181Z Ran 1 test in 6.269s 2022-11-23T01:43:17.8847200Z 2022-11-23T01:43:17.8847296Z OK 2022-11-23T01:43:17.8847316Z 2022-11-23T01:43:17.8847443Z Generating XML reports... 2022-11-23T01:43:17.8847897Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013119.xml 2022-11-23T01:43:17.8848281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8848461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8848896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8849102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8849122Z 2022-11-23T01:43:17.8849232Z Running tests... 2022-11-23T01:43:17.8849487Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8849805Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8850083Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8850311Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10965 2022-11-23T01:43:17.8850534Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10966 2022-11-23T01:43:17.8850914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8851097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8851482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8851661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8852036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8852214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8852596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8852794Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8853044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8853296Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8853709Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8854114Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8854332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8854565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8854824Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8fubobjl 2022-11-23T01:43:17.8855163Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8fubobjl/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8855421Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn2tredrj 2022-11-23T01:43:17.8855696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn2tredrj/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8855936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8856175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8856441Z [1669167093.332535] [d8f8c46cdf70:10966:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8856680Z [1669167093.339523] [d8f8c46cdf70:10966:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8856929Z [1669167093.339523] [d8f8c46cdf70:10966:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8857208Z [1669167093.324512] [d8f8c46cdf70:10965:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8857539Z [1669167093.332197] [d8f8c46cdf70:10965:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8857795Z [1669167093.332197] [d8f8c46cdf70:10965:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8858208Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T01:43:17.8858377Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T01:43:17.8858780Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T01:43:17.8858953Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T01:43:17.8859042Z ok (5.971s) 2022-11-23T01:43:17.8859062Z 2022-11-23T01:43:17.8859331Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8859446Z Ran 1 test in 5.971s 2022-11-23T01:43:17.8859466Z 2022-11-23T01:43:17.8859558Z OK 2022-11-23T01:43:17.8859578Z 2022-11-23T01:43:17.8859706Z Generating XML reports... 2022-11-23T01:43:17.8860164Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013128.xml 2022-11-23T01:43:17.8860542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8860725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8861095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8861300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8861321Z 2022-11-23T01:43:17.8861434Z Running tests... 
2022-11-23T01:43:17.8861702Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8862018Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8862290Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8862514Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11083 2022-11-23T01:43:17.8862733Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11084 2022-11-23T01:43:17.8863116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8863278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8863663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8863922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8864303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8864487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8864876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8865071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8865319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8865554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8865963Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8866374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8866610Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8866890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8867162Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbststpxg 2022-11-23T01:43:17.8867424Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplniqkpvu 2022-11-23T01:43:17.8867700Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbststpxg/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8867969Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplniqkpvu/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8868192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8868440Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8868725Z [1669167100.633695] [d8f8c46cdf70:11084:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8868968Z [1669167101.418123] [d8f8c46cdf70:11084:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8869215Z [1669167101.418123] [d8f8c46cdf70:11084:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8869494Z [1669167100.633444] [d8f8c46cdf70:11083:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8869729Z [1669167101.443486] [d8f8c46cdf70:11083:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8869977Z [1669167101.443486] [d8f8c46cdf70:11083:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8870085Z ok (5.454s) 2022-11-23T01:43:17.8870106Z 2022-11-23T01:43:17.8870386Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8870486Z Ran 1 test in 5.454s 2022-11-23T01:43:17.8870506Z 2022-11-23T01:43:17.8870616Z OK 2022-11-23T01:43:17.8870636Z 2022-11-23T01:43:17.8870764Z Generating XML reports... 2022-11-23T01:43:17.8871218Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013136.xml 2022-11-23T01:43:17.8871597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8871778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8872163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8872432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8872453Z 2022-11-23T01:43:17.8872545Z Running tests... 2022-11-23T01:43:17.8872815Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8873134Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8873403Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8873625Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11227 2022-11-23T01:43:17.8873848Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11228 2022-11-23T01:43:17.8874228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8874407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8874796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8874973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8875588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8875844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8876245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8876441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8876692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8876942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8877355Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8877746Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8877982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8878217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8878480Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo8pldlbz 2022-11-23T01:43:17.8878756Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo8pldlbz/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8879013Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm_700f_7 2022-11-23T01:43:17.8879282Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm_700f_7/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8879571Z [1669167109.407073] [d8f8c46cdf70:11228:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8879812Z [1669167109.412608] [d8f8c46cdf70:11228:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8880062Z [1669167109.412608] [d8f8c46cdf70:11228:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8880328Z [1669167109.403292] [d8f8c46cdf70:11227:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8880564Z [1669167109.411075] [d8f8c46cdf70:11227:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8880807Z [1669167109.411075] [d8f8c46cdf70:11227:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8881126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8881369Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.8881475Z ok (5.943s) 2022-11-23T01:43:17.8881497Z 2022-11-23T01:43:17.8881777Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8881899Z Ran 1 test in 5.943s 2022-11-23T01:43:17.8881919Z 2022-11-23T01:43:17.8882015Z OK 2022-11-23T01:43:17.8882034Z 2022-11-23T01:43:17.8882143Z Generating XML reports... 2022-11-23T01:43:17.8882601Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013144.xml 2022-11-23T01:43:17.8882986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8883169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8883561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8883759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8883779Z 2022-11-23T01:43:17.8883893Z Running tests... 2022-11-23T01:43:17.8884164Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8884513Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8884815Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8885039Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11345 2022-11-23T01:43:17.8885260Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11346 2022-11-23T01:43:17.8885638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8885823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8886208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8886403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8886784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8886948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8887335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8887531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8887783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8888037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8888443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8888845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8889085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8889317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8889545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8889787Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8890189Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8890649Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8890895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:43:17.8891143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:43:17.8891537Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.8891930Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:43:17.8892192Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1qdf53j3 2022-11-23T01:43:17.8892449Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1qdf53j3/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8892711Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7doof22h 2022-11-23T01:43:17.8892984Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7doof22h/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8893269Z [1669167117.977464] [d8f8c46cdf70:11346:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8893560Z [1669167117.984008] [d8f8c46cdf70:11346:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8893820Z [1669167117.984008] [d8f8c46cdf70:11346:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8894100Z [1669167117.972814] [d8f8c46cdf70:11345:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8894339Z [1669167117.979353] [d8f8c46cdf70:11345:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8894587Z [1669167117.979353] [d8f8c46cdf70:11345:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8894674Z ok (5.559s) 2022-11-23T01:43:17.8894713Z 2022-11-23T01:43:17.8894972Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8895094Z Ran 1 test in 5.559s 2022-11-23T01:43:17.8895114Z 2022-11-23T01:43:17.8895211Z OK 2022-11-23T01:43:17.8895232Z 2022-11-23T01:43:17.8895357Z Generating XML reports... 
2022-11-23T01:43:17.8895809Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013153.xml 2022-11-23T01:43:17.8896187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8896369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8896758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8896943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8896980Z 2022-11-23T01:43:17.8897076Z Running tests... 2022-11-23T01:43:17.8897344Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8897666Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8897954Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8898179Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11465 2022-11-23T01:43:17.8898401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11466 2022-11-23T01:43:17.8898777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8899016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8899387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8899584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8899963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8900142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8900525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8900718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8900967Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8901216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8901612Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8902018Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8902304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8902549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8902794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.8903037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.8903441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8903846Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.8904092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:43:17.8904319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:43:17.8904718Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.8905113Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:43:17.8905379Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkvghbcs5 2022-11-23T01:43:17.8905654Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkvghbcs5/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8905911Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbclqrgo1 2022-11-23T01:43:17.8906185Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbclqrgo1/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8906468Z [1669167126.101311] [d8f8c46cdf70:11465:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8906713Z [1669167126.107488] [d8f8c46cdf70:11465:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8906963Z [1669167126.107488] [d8f8c46cdf70:11465:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8907274Z [1669167136.477150] [d8f8c46cdf70:11465:1] ucc_schedule.h:189 UCC WARN timeout 10 sec. has expired on req 0x56427f98b800, seq_num 3, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T01:43:17.8907563Z [1669167136.513688] [d8f8c46cdf70:11465:0] mpool.c:55 UCX WARN object 0x56427fa9cdc0 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T01:43:17.8907903Z [1669167126.110784] [d8f8c46cdf70:11466:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8908144Z [1669167126.117207] [d8f8c46cdf70:11466:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8908386Z [1669167126.117207] [d8f8c46cdf70:11466:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8908788Z [1669167136.523764] [d8f8c46cdf70:11466:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x5631764897c0 was not matched 2022-11-23T01:43:17.8908895Z ok (15.633s) 2022-11-23T01:43:17.8908915Z 2022-11-23T01:43:17.8909194Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8909311Z Ran 1 test in 15.633s 2022-11-23T01:43:17.8909334Z 2022-11-23T01:43:17.8909429Z OK 2022-11-23T01:43:17.8909449Z 2022-11-23T01:43:17.8909557Z Generating XML reports... 2022-11-23T01:43:17.8910012Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013201.xml 2022-11-23T01:43:17.8910443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8910632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8911023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8911220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8911241Z 2022-11-23T01:43:17.8911352Z Running tests... 2022-11-23T01:43:17.8911620Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8911920Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8912237Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8912466Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11585 2022-11-23T01:43:17.8912693Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11586 2022-11-23T01:43:17.8913073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8913255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8913641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8913836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8914213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8914380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8914767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8914962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8915423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8915681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8916094Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8916498Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8916735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8917054Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8917300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp2zyjpvg 2022-11-23T01:43:17.8917578Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp2zyjpvg/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8917836Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbd174620 2022-11-23T01:43:17.8918107Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbd174620/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8918388Z [1669167144.222482] [d8f8c46cdf70:11586:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8918630Z [1669167144.229710] [d8f8c46cdf70:11586:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8918882Z [1669167144.229710] [d8f8c46cdf70:11586:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8919163Z [1669167144.217058] [d8f8c46cdf70:11585:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8919462Z [1669167144.223112] [d8f8c46cdf70:11585:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8919722Z [1669167144.223112] [d8f8c46cdf70:11585:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8919809Z ok (6.137s) 2022-11-23T01:43:17.8919832Z 2022-11-23T01:43:17.8920113Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8920229Z Ran 1 test in 6.137s 2022-11-23T01:43:17.8920250Z 2022-11-23T01:43:17.8920343Z OK 2022-11-23T01:43:17.8920362Z 2022-11-23T01:43:17.8920494Z Generating XML reports... 2022-11-23T01:43:17.8920950Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013219.xml 2022-11-23T01:43:17.8921368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8921550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8921925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8922121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8922141Z 2022-11-23T01:43:17.8922250Z Running tests... 2022-11-23T01:43:17.8922522Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8922840Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8923135Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8923364Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11703 2022-11-23T01:43:17.8923590Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11704 2022-11-23T01:43:17.8923972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8924136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8924524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8924721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8925099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8925277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8925732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8925928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8926177Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8926413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8926819Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8927223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8927457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8927690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8927957Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprsl5chgi 2022-11-23T01:43:17.8928233Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprsl5chgi/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8928490Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppjy7j0h2 2022-11-23T01:43:17.8928809Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppjy7j0h2/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8929084Z [1669167152.786061] [d8f8c46cdf70:11704:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8929328Z [1669167152.791571] [d8f8c46cdf70:11704:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8929578Z [1669167152.791571] [d8f8c46cdf70:11704:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8929865Z [1669167152.777521] [d8f8c46cdf70:11703:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8930103Z [1669167152.783224] [d8f8c46cdf70:11703:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8930350Z [1669167152.783224] [d8f8c46cdf70:11703:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8930456Z ok (6.100s) 2022-11-23T01:43:17.8930476Z 2022-11-23T01:43:17.8930754Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8930872Z Ran 1 test in 6.101s 2022-11-23T01:43:17.8930892Z 2022-11-23T01:43:17.8930987Z OK 2022-11-23T01:43:17.8931006Z 2022-11-23T01:43:17.8931114Z Generating XML reports... 2022-11-23T01:43:17.8931572Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013228.xml 2022-11-23T01:43:17.8931961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8932144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8932532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8932729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8932750Z 2022-11-23T01:43:17.8932862Z Running tests... 2022-11-23T01:43:17.8933131Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8933430Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8933693Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8933917Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11821 2022-11-23T01:43:17.8934216Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11822 2022-11-23T01:43:17.8934597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8934778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8935167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8935361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8935734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8935894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8936277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8936474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8936721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8936968Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8937430Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8937846Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8938083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8938318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8938559Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0_dyyqef 2022-11-23T01:43:17.8938838Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0_dyyqef/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8939097Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpba6bjvuc 2022-11-23T01:43:17.8939367Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpba6bjvuc/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8939652Z [1669167161.556848] [d8f8c46cdf70:11821:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8939894Z [1669167161.564342] [d8f8c46cdf70:11821:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8940142Z [1669167161.564342] [d8f8c46cdf70:11821:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8940419Z [1669167161.557513] [d8f8c46cdf70:11822:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8940661Z [1669167161.564368] [d8f8c46cdf70:11822:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8940887Z [1669167161.564368] [d8f8c46cdf70:11822:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8940992Z ok (6.048s) 2022-11-23T01:43:17.8941016Z 2022-11-23T01:43:17.8941288Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8941406Z Ran 1 test in 6.049s 2022-11-23T01:43:17.8941427Z 2022-11-23T01:43:17.8941522Z OK 2022-11-23T01:43:17.8941541Z 2022-11-23T01:43:17.8941667Z Generating XML reports... 2022-11-23T01:43:17.8942120Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013236.xml 2022-11-23T01:43:17.8942501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8942742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8943119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8943320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8943340Z 2022-11-23T01:43:17.8943454Z Running tests... 2022-11-23T01:43:17.8943722Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8944041Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8944311Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8944533Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11935 2022-11-23T01:43:17.8944756Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11936 2022-11-23T01:43:17.8945123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8945303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8945690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8945933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8946318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8946498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8946881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8947074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8947323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8947559Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8947970Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8948381Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8948617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8948845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8949103Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2zs5caif 2022-11-23T01:43:17.8949375Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2zs5caif/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8949636Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyimg2vrj 2022-11-23T01:43:17.8949909Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyimg2vrj/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8950175Z [1669167170.077362] [d8f8c46cdf70:11935:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8950420Z [1669167170.083652] [d8f8c46cdf70:11935:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8950668Z [1669167170.083652] [d8f8c46cdf70:11935:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8951453Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:43:17.8951801Z [1669167170.084217] [d8f8c46cdf70:11936:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8952042Z [1669167170.089933] [d8f8c46cdf70:11936:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8952288Z [1669167170.089933] [d8f8c46cdf70:11936:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8953069Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:43:17.8953361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8953611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8953715Z ok (5.957s) 2022-11-23T01:43:17.8953735Z 2022-11-23T01:43:17.8954010Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8954126Z Ran 1 test in 5.957s 2022-11-23T01:43:17.8954146Z 2022-11-23T01:43:17.8954223Z OK 2022-11-23T01:43:17.8954242Z 2022-11-23T01:43:17.8954369Z Generating XML reports... 
2022-11-23T01:43:17.8954824Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013245.xml 2022-11-23T01:43:17.8955427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8955615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8956014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8956212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8956233Z 2022-11-23T01:43:17.8956342Z Running tests... 2022-11-23T01:43:17.8956592Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8956908Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8957195Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8957962Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78338 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.651s) 2022-11-23T01:43:17.8957984Z 2022-11-23T01:43:17.8958257Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8958372Z Ran 1 test in 1.652s 2022-11-23T01:43:17.8958392Z 2022-11-23T01:43:17.8958502Z OK (skipped=1) 2022-11-23T01:43:17.8958523Z 2022-11-23T01:43:17.8958649Z Generating XML reports... 
2022-11-23T01:43:17.8959103Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013253.xml 2022-11-23T01:43:17.8959483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8959648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8960133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8960328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8960349Z 2022-11-23T01:43:17.8960459Z Running tests... 2022-11-23T01:43:17.8960731Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8961049Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8961338Z test_ddp_profiling_autograd_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8962090Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77342 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.656s) 2022-11-23T01:43:17.8962116Z 2022-11-23T01:43:17.8962386Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8962503Z Ran 1 test in 1.657s 2022-11-23T01:43:17.8962523Z 2022-11-23T01:43:17.8962614Z OK (skipped=1) 2022-11-23T01:43:17.8962634Z 2022-11-23T01:43:17.8962759Z Generating XML reports... 
2022-11-23T01:43:17.8963276Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013258.xml 2022-11-23T01:43:17.8963668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8963848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8964234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8964433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8964458Z 2022-11-23T01:43:17.8964568Z Running tests... 2022-11-23T01:43:17.8964818Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8965136Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8965417Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8965645Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12121 2022-11-23T01:43:17.8965870Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12122 2022-11-23T01:43:17.8966254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8966434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8966818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8967018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8967376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8967555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8967940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8968133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8968382Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8968634Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8969038Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8969509Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8969743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8969958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8970225Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5j90od3s 2022-11-23T01:43:17.8970499Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5j90od3s/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8970755Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8d32yf7f 2022-11-23T01:43:17.8971024Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8d32yf7f/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8971306Z [1669167187.053891] [d8f8c46cdf70:12122:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8971553Z [1669167187.060212] [d8f8c46cdf70:12122:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8971801Z [1669167187.060212] [d8f8c46cdf70:12122:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8972201Z STAGE:2022-11-23 01:33:07 12122:12122 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8972474Z [1669167187.045219] [d8f8c46cdf70:12121:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8972715Z [1669167187.051563] [d8f8c46cdf70:12121:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8972959Z [1669167187.051563] [d8f8c46cdf70:12121:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8973318Z STAGE:2022-11-23 01:33:07 12121:12121 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8973563Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8973808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.8974159Z STAGE:2022-11-23 01:33:08 12122:12122 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8974497Z STAGE:2022-11-23 01:33:08 12121:12121 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8974853Z STAGE:2022-11-23 01:33:08 12121:12121 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8975190Z STAGE:2022-11-23 01:33:08 12122:12122 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8975978Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:43:17.8976786Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. 
This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:43:17.8977129Z STAGE:2022-11-23 01:33:08 12122:12122 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8977499Z STAGE:2022-11-23 01:33:08 12121:12121 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.8977843Z STAGE:2022-11-23 01:33:08 12122:12122 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8978202Z STAGE:2022-11-23 01:33:08 12122:12122 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8978539Z STAGE:2022-11-23 01:33:08 12121:12121 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.8978891Z STAGE:2022-11-23 01:33:08 12121:12121 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.8979043Z ok (6.760s) 2022-11-23T01:43:17.8979064Z 2022-11-23T01:43:17.8979335Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8979451Z Ran 1 test in 6.760s 2022-11-23T01:43:17.8979472Z 2022-11-23T01:43:17.8979566Z OK 2022-11-23T01:43:17.8979590Z 2022-11-23T01:43:17.8979697Z Generating XML reports... 
2022-11-23T01:43:17.8980156Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013302.xml 2022-11-23T01:43:17.8980537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8980768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8981167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8981364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8981385Z 2022-11-23T01:43:17.8981496Z Running tests... 2022-11-23T01:43:17.8981763Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8982062Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8982344Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8982568Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12243 2022-11-23T01:43:17.8982792Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12244 2022-11-23T01:43:17.8983175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8983355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8983740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8983936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8984313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8984474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8984864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8985060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8985310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.8985563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.8985973Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.8986377Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.8986613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.8986847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.8987157Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3d1qeabs 2022-11-23T01:43:17.8987433Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3d1qeabs/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8987696Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxv35iz5u 2022-11-23T01:43:17.8987970Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxv35iz5u/_remote_module_non_scriptable.py 2022-11-23T01:43:17.8988252Z [1669167196.425367] [d8f8c46cdf70:12243:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.8988491Z [1669167196.431429] [d8f8c46cdf70:12243:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8988739Z [1669167196.431429] [d8f8c46cdf70:12243:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8989024Z [1669167196.426676] [d8f8c46cdf70:12244:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.8989263Z [1669167196.431798] [d8f8c46cdf70:12244:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.8989541Z [1669167196.431798] [d8f8c46cdf70:12244:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.8989654Z ok (5.573s) 2022-11-23T01:43:17.8989675Z 2022-11-23T01:43:17.8989951Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8990068Z Ran 1 test in 5.573s 2022-11-23T01:43:17.8990088Z 2022-11-23T01:43:17.8990182Z OK 2022-11-23T01:43:17.8990202Z 2022-11-23T01:43:17.8990328Z Generating XML reports... 2022-11-23T01:43:17.8990786Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013311.xml 2022-11-23T01:43:17.8991245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8991427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8991804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8992004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8992025Z 2022-11-23T01:43:17.8992137Z Running tests... 2022-11-23T01:43:17.8992406Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8992725Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8993009Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8993767Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78595 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.626s) 2022-11-23T01:43:17.8993788Z 2022-11-23T01:43:17.8994061Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8994176Z Ran 1 test in 1.626s 2022-11-23T01:43:17.8994197Z 2022-11-23T01:43:17.8994289Z OK (skipped=1) 2022-11-23T01:43:17.8994328Z 2022-11-23T01:43:17.8994436Z Generating XML reports... 2022-11-23T01:43:17.8994890Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013319.xml 2022-11-23T01:43:17.8995485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8995667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8996207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8996406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8996429Z 2022-11-23T01:43:17.8996539Z Running tests... 2022-11-23T01:43:17.8996806Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.8997103Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.8997390Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.8997614Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12391 2022-11-23T01:43:17.8997832Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12392 2022-11-23T01:43:17.8998208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8998394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8998780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.8998973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.8999395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.8999590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.8999980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9000175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9000423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9000679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9001087Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9001489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9001724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9001940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9002202Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd2r729z4 2022-11-23T01:43:17.9002474Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd2r729z4/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9002731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk8xd99io 2022-11-23T01:43:17.9003005Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk8xd99io/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9003927Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T01:43:17.9004046Z warnings.warn( 2022-11-23T01:43:17.9004960Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T01:43:17.9005136Z warnings.warn( 2022-11-23T01:43:17.9005381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.9005623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:43:17.9005892Z [1669167208.639894] [d8f8c46cdf70:12391:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9006134Z [1669167208.645884] [d8f8c46cdf70:12391:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9006384Z [1669167208.645884] [d8f8c46cdf70:12391:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9006664Z [1669167208.639917] [d8f8c46cdf70:12392:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9006901Z [1669167208.645776] [d8f8c46cdf70:12392:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9007150Z [1669167208.645776] [d8f8c46cdf70:12392:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9007254Z ok (5.946s) 2022-11-23T01:43:17.9007275Z 2022-11-23T01:43:17.9007608Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9007733Z Ran 1 test in 5.946s 2022-11-23T01:43:17.9007755Z 2022-11-23T01:43:17.9007830Z OK 2022-11-23T01:43:17.9007869Z 2022-11-23T01:43:17.9007976Z Generating XML reports... 
2022-11-23T01:43:17.9008436Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013323.xml 2022-11-23T01:43:17.9008820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9008999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9009391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9009586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9009606Z 2022-11-23T01:43:17.9009716Z Running tests... 2022-11-23T01:43:17.9009985Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9010287Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9010575Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9011334Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77625 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.638s) 2022-11-23T01:43:17.9011359Z 2022-11-23T01:43:17.9011631Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9011747Z Ran 1 test in 1.638s 2022-11-23T01:43:17.9011767Z 2022-11-23T01:43:17.9011877Z OK (skipped=1) 2022-11-23T01:43:17.9011897Z 2022-11-23T01:43:17.9012023Z Generating XML reports... 
2022-11-23T01:43:17.9012479Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013332.xml 2022-11-23T01:43:17.9012862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9013043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9013413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9013616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9013691Z 2022-11-23T01:43:17.9013812Z Running tests... 2022-11-23T01:43:17.9014086Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9014405Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9014683Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9014913Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12543 2022-11-23T01:43:17.9015139Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12544 2022-11-23T01:43:17.9015500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9015682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9016068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9016269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9016643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9016822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9017267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9017469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9017724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9017956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9018370Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9018780Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9019015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9019251Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9019517Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg9mcf6z8 2022-11-23T01:43:17.9019796Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg9mcf6z8/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9020054Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoyxfnsi6 2022-11-23T01:43:17.9020312Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoyxfnsi6/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9020597Z [1669167221.371483] [d8f8c46cdf70:12543:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9020847Z [1669167221.378157] [d8f8c46cdf70:12543:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9021098Z [1669167221.378157] [d8f8c46cdf70:12543:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9021495Z STAGE:2022-11-23 01:33:42 12543:12543 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9021776Z [1669167221.372965] [d8f8c46cdf70:12544:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9022014Z [1669167221.379257] [d8f8c46cdf70:12544:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9022262Z [1669167221.379257] [d8f8c46cdf70:12544:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9022613Z STAGE:2022-11-23 01:33:42 12544:12544 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9022922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.9023148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:43:17.9023507Z STAGE:2022-11-23 01:33:42 12543:12543 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9023848Z STAGE:2022-11-23 01:33:42 12544:12544 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9024203Z STAGE:2022-11-23 01:33:42 12544:12544 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9024558Z STAGE:2022-11-23 01:33:42 12543:12543 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9024896Z STAGE:2022-11-23 01:33:42 12543:12543 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9025234Z STAGE:2022-11-23 01:33:42 12543:12543 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9025586Z STAGE:2022-11-23 01:33:42 12543:12543 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9025691Z ok (6.754s) 2022-11-23T01:43:17.9025712Z 2022-11-23T01:43:17.9025963Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9026128Z Ran 1 test in 6.754s 2022-11-23T01:43:17.9026152Z 2022-11-23T01:43:17.9026253Z OK 2022-11-23T01:43:17.9026272Z 2022-11-23T01:43:17.9026399Z Generating XML reports... 
2022-11-23T01:43:17.9026858Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013336.xml 2022-11-23T01:43:17.9027235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9027418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9027819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9028020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9028040Z 2022-11-23T01:43:17.9028133Z Running tests... 2022-11-23T01:43:17.9028401Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9028721Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9028989Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9029214Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12665 2022-11-23T01:43:17.9029435Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12666 2022-11-23T01:43:17.9029813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9029997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9030367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9030562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9030944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9031122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9031508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9031701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9031951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9032200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9032670Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9033055Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9033298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9033532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9033792Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmqrx09jv 2022-11-23T01:43:17.9034064Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmqrx09jv/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9034320Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0n1delft 2022-11-23T01:43:17.9034591Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0n1delft/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9034881Z [1669167230.668437] [d8f8c46cdf70:12665:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9035329Z [1669167230.675893] [d8f8c46cdf70:12665:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9035648Z [1669167230.675893] [d8f8c46cdf70:12665:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9035942Z [1669167230.669156] [d8f8c46cdf70:12666:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9036184Z [1669167230.676059] [d8f8c46cdf70:12666:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9036424Z [1669167230.676059] [d8f8c46cdf70:12666:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9036533Z ok (5.546s) 2022-11-23T01:43:17.9036555Z 2022-11-23T01:43:17.9036838Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9036956Z Ran 1 test in 5.546s 2022-11-23T01:43:17.9036975Z 2022-11-23T01:43:17.9037069Z OK 2022-11-23T01:43:17.9037089Z 2022-11-23T01:43:17.9037219Z Generating XML reports... 2022-11-23T01:43:17.9037656Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013346.xml 2022-11-23T01:43:17.9038040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9038220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9038607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9038811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9038831Z 2022-11-23T01:43:17.9038946Z Running tests... 2022-11-23T01:43:17.9039215Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9039532Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9039813Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9040019Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12779 2022-11-23T01:43:17.9040240Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12780 2022-11-23T01:43:17.9040622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9040801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9041191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9041466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9041847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9042031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9042404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9042599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9042849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9043099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9043505Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9043913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9044151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9044431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9044702Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4fzkxda5 2022-11-23T01:43:17.9044960Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4fzkxda5/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9045217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpchwrc4qm 2022-11-23T01:43:17.9045489Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpchwrc4qm/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9045770Z [1669167238.712947] [d8f8c46cdf70:12780:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9046016Z [1669167238.719612] [d8f8c46cdf70:12780:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9046267Z [1669167238.719612] [d8f8c46cdf70:12780:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9046547Z [1669167238.712925] [d8f8c46cdf70:12779:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9046785Z [1669167238.719599] [d8f8c46cdf70:12779:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9047032Z [1669167238.719599] [d8f8c46cdf70:12779:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9047138Z ok (5.433s) 2022-11-23T01:43:17.9047162Z 2022-11-23T01:43:17.9047424Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9047544Z Ran 1 test in 5.433s 2022-11-23T01:43:17.9047564Z 2022-11-23T01:43:17.9047660Z OK 2022-11-23T01:43:17.9047680Z 2022-11-23T01:43:17.9047806Z Generating XML reports... 2022-11-23T01:43:17.9048265Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013354.xml 2022-11-23T01:43:17.9048648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9048830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9049216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9049394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9049432Z 2022-11-23T01:43:17.9049525Z Running tests... 2022-11-23T01:43:17.9049859Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9050176Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9050459Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9051216Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78684 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.660s) 2022-11-23T01:43:17.9051238Z 2022-11-23T01:43:17.9051508Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9051627Z Ran 1 test in 1.660s 2022-11-23T01:43:17.9051648Z 2022-11-23T01:43:17.9051758Z OK (skipped=1) 2022-11-23T01:43:17.9051777Z 2022-11-23T01:43:17.9051903Z Generating XML reports... 2022-11-23T01:43:17.9052345Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013402.xml 2022-11-23T01:43:17.9052725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9052906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9053339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9053545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9053567Z 2022-11-23T01:43:17.9053678Z Running tests... 2022-11-23T01:43:17.9053948Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9054264Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9054510Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9055267Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/75648 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.665s) 2022-11-23T01:43:17.9055309Z 2022-11-23T01:43:17.9055561Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9055681Z Ran 1 test in 1.665s 2022-11-23T01:43:17.9055701Z 2022-11-23T01:43:17.9055813Z OK (skipped=1) 2022-11-23T01:43:17.9055833Z 2022-11-23T01:43:17.9055958Z Generating XML reports... 2022-11-23T01:43:17.9056410Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013406.xml 2022-11-23T01:43:17.9056791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9056976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9057363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9057544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9057582Z 2022-11-23T01:43:17.9057676Z Running tests... 2022-11-23T01:43:17.9057946Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9058265Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9058559Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9059304Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78113 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.604s) 2022-11-23T01:43:17.9059389Z 2022-11-23T01:43:17.9059664Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9059783Z Ran 1 test in 1.604s 2022-11-23T01:43:17.9059804Z 2022-11-23T01:43:17.9059915Z OK (skipped=1) 2022-11-23T01:43:17.9059935Z 2022-11-23T01:43:17.9060065Z Generating XML reports... 2022-11-23T01:43:17.9060499Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013410.xml 2022-11-23T01:43:17.9060880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9061060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9061446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9061649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9061669Z 2022-11-23T01:43:17.9061781Z Running tests... 2022-11-23T01:43:17.9062048Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9062364Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9062700Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9062938Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12995 2022-11-23T01:43:17.9063159Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12996 2022-11-23T01:43:17.9063537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9063715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9064103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9064306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9064676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9064860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9065228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9065425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9065677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9065925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9066330Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9066742Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9066976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9067215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9067477Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2pnwwiqb 2022-11-23T01:43:17.9067737Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2pnwwiqb/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9067997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2wolefz9 2022-11-23T01:43:17.9068271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2wolefz9/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9068554Z [1669167259.229105] [d8f8c46cdf70:12995:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9068858Z [1669167259.235227] [d8f8c46cdf70:12995:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9069111Z [1669167259.235227] [d8f8c46cdf70:12995:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9069393Z [1669167259.229537] [d8f8c46cdf70:12996:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9069634Z [1669167259.235484] [d8f8c46cdf70:12996:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9069875Z [1669167259.235484] [d8f8c46cdf70:12996:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9069964Z ok (5.967s) 2022-11-23T01:43:17.9070008Z 2022-11-23T01:43:17.9070266Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9070382Z Ran 1 test in 5.967s 2022-11-23T01:43:17.9070401Z 2022-11-23T01:43:17.9070498Z OK 2022-11-23T01:43:17.9070518Z 2022-11-23T01:43:17.9070643Z Generating XML reports... 2022-11-23T01:43:17.9071147Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013414.xml 2022-11-23T01:43:17.9071543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9071723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9072111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9072291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9072312Z 2022-11-23T01:43:17.9072423Z Running tests... 2022-11-23T01:43:17.9072700Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9073019Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9073292Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9073519Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13113 2022-11-23T01:43:17.9073743Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13114 2022-11-23T01:43:17.9074117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9074278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9074665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9074861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9075461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9075649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9076047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9076242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9076490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9076739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9077128Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9077534Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9077860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9078253Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T01:43:17.9078516Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T01:43:17.9078749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9079132Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T01:43:17.9079391Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T01:43:17.9079652Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpae919hv3 2022-11-23T01:43:17.9079909Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpae919hv3/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9080171Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl2rd4klu 2022-11-23T01:43:17.9080442Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl2rd4klu/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9080791Z [1669167267.726777] [d8f8c46cdf70:13113:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9081047Z [1669167267.734292] [d8f8c46cdf70:13113:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9081294Z [1669167267.734292] [d8f8c46cdf70:13113:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9081575Z [1669167267.733137] [d8f8c46cdf70:13114:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9081813Z [1669167267.738710] [d8f8c46cdf70:13114:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9082063Z [1669167267.738710] [d8f8c46cdf70:13114:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9082151Z ok (5.525s) 2022-11-23T01:43:17.9082189Z 2022-11-23T01:43:17.9082453Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9082570Z Ran 1 test in 5.525s 2022-11-23T01:43:17.9082590Z 2022-11-23T01:43:17.9082685Z OK 2022-11-23T01:43:17.9082705Z 2022-11-23T01:43:17.9082831Z Generating XML reports... 2022-11-23T01:43:17.9083285Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013423.xml 2022-11-23T01:43:17.9083668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9083848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9084239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9084417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9084438Z 2022-11-23T01:43:17.9084548Z Running tests... 2022-11-23T01:43:17.9084819Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9085138Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9085406Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9085629Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13227 2022-11-23T01:43:17.9085848Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13228 2022-11-23T01:43:17.9086224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9086452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9086841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9087037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9087417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9087598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9087983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9088179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9088430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9088683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9089074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9089482Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9089767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9090021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9090247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9090483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9090887Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9091293Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9091400Z ok (4.323s) 2022-11-23T01:43:17.9091420Z 2022-11-23T01:43:17.9091670Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9091788Z Ran 1 test in 4.324s 2022-11-23T01:43:17.9091808Z 2022-11-23T01:43:17.9091908Z OK 2022-11-23T01:43:17.9091928Z 2022-11-23T01:43:17.9092054Z Generating XML reports... 2022-11-23T01:43:17.9092511Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013431.xml 2022-11-23T01:43:17.9092887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9093068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9093455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9093658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9093679Z 2022-11-23T01:43:17.9093773Z Running tests... 
2022-11-23T01:43:17.9094039Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9094359Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9094618Z test_destroy_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9094841Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13330 2022-11-23T01:43:17.9095061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13331 2022-11-23T01:43:17.9095436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9095615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9096048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9096246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9096622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9096806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9097193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9097387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9097636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9097884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9098299Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9098687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9098924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9099215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9099451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9099688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9100087Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9100483Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9100595Z ok (4.360s) 2022-11-23T01:43:17.9100616Z 2022-11-23T01:43:17.9100884Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9100983Z Ran 1 test in 4.360s 2022-11-23T01:43:17.9101003Z 2022-11-23T01:43:17.9101097Z OK 2022-11-23T01:43:17.9101117Z 2022-11-23T01:43:17.9101247Z Generating XML reports... 
2022-11-23T01:43:17.9101701Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013438.xml 2022-11-23T01:43:17.9102080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9102263Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9102653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9102854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9102874Z 2022-11-23T01:43:17.9102967Z Running tests... 2022-11-23T01:43:17.9103237Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9103554Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9103838Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9104593Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78767 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.641s) 2022-11-23T01:43:17.9104614Z 2022-11-23T01:43:17.9104884Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9105000Z Ran 1 test in 1.641s 2022-11-23T01:43:17.9105074Z 2022-11-23T01:43:17.9105190Z OK (skipped=1) 2022-11-23T01:43:17.9105209Z 2022-11-23T01:43:17.9105335Z Generating XML reports... 
2022-11-23T01:43:17.9105790Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013444.xml 2022-11-23T01:43:17.9106157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9106341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9106727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9106922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9106943Z 2022-11-23T01:43:17.9107054Z Running tests... 2022-11-23T01:43:17.9107324Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9107642Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9107929Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9108728Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78748 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.603s) 2022-11-23T01:43:17.9108753Z 2022-11-23T01:43:17.9109011Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9109129Z Ran 1 test in 1.603s 2022-11-23T01:43:17.9109149Z 2022-11-23T01:43:17.9109258Z OK (skipped=1) 2022-11-23T01:43:17.9109278Z 2022-11-23T01:43:17.9109403Z Generating XML reports... 
2022-11-23T01:43:17.9109856Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013449.xml 2022-11-23T01:43:17.9110243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9110423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9110809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9111010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9111031Z 2022-11-23T01:43:17.9111125Z Running tests... 2022-11-23T01:43:17.9111393Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9111710Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9111987Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9112214Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13501 2022-11-23T01:43:17.9112440Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13502 2022-11-23T01:43:17.9112816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9112995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9113368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9113565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9113949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9114129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9114515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9114767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9115225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9115494Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9115915Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9116302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9116538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9116769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9116872Z ok (4.307s) 2022-11-23T01:43:17.9116893Z 2022-11-23T01:43:17.9117165Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9117284Z Ran 1 test in 4.307s 2022-11-23T01:43:17.9117304Z 2022-11-23T01:43:17.9117399Z OK 2022-11-23T01:43:17.9117419Z 2022-11-23T01:43:17.9117546Z Generating XML reports... 2022-11-23T01:43:17.9117999Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013453.xml 2022-11-23T01:43:17.9118436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9118628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9119022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9119218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9119238Z 2022-11-23T01:43:17.9119349Z Running tests... 2022-11-23T01:43:17.9119617Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9119939Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9120207Z test_gather (__main__.TestDistBackendWithSpawn) ... 
skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9120228Z 2022-11-23T01:43:17.9120491Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9120593Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9120613Z 2022-11-23T01:43:17.9120722Z OK (skipped=1) 2022-11-23T01:43:17.9120742Z 2022-11-23T01:43:17.9120868Z Generating XML reports... 2022-11-23T01:43:17.9121355Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013500.xml 2022-11-23T01:43:17.9121733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9121916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9122303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9122499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9122520Z 2022-11-23T01:43:17.9122614Z Running tests... 2022-11-23T01:43:17.9122880Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9123200Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9123474Z test_gather_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9123494Z 2022-11-23T01:43:17.9123758Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9123871Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9123892Z 2022-11-23T01:43:17.9124003Z OK (skipped=1) 2022-11-23T01:43:17.9124022Z 2022-11-23T01:43:17.9124146Z Generating XML reports... 
2022-11-23T01:43:17.9124692Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013502.xml 2022-11-23T01:43:17.9125055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9125235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9125628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9125826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9125846Z 2022-11-23T01:43:17.9125956Z Running tests... 2022-11-23T01:43:17.9126224Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9126540Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9126799Z test_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T01:43:17.9126823Z 2022-11-23T01:43:17.9127089Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9127186Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9127206Z 2022-11-23T01:43:17.9127315Z OK (skipped=1) 2022-11-23T01:43:17.9127334Z 2022-11-23T01:43:17.9127458Z Generating XML reports... 
2022-11-23T01:43:17.9127968Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013504.xml 2022-11-23T01:43:17.9128359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9128539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9128925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9129125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9129148Z 2022-11-23T01:43:17.9129241Z Running tests... 2022-11-23T01:43:17.9129507Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9129821Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9130098Z test_gather_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9130118Z 2022-11-23T01:43:17.9130386Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9130505Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9130525Z 2022-11-23T01:43:17.9130634Z OK (skipped=1) 2022-11-23T01:43:17.9130653Z 2022-11-23T01:43:17.9130780Z Generating XML reports... 
2022-11-23T01:43:17.9131231Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013507.xml 2022-11-23T01:43:17.9131590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9131774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9132160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9132356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9132380Z 2022-11-23T01:43:17.9132490Z Running tests... 2022-11-23T01:43:17.9132756Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9133073Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9133346Z test_gather_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9133366Z 2022-11-23T01:43:17.9133631Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9133728Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9133798Z 2022-11-23T01:43:17.9133914Z OK (skipped=1) 2022-11-23T01:43:17.9133934Z 2022-11-23T01:43:17.9134058Z Generating XML reports... 
2022-11-23T01:43:17.9134514Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013509.xml 2022-11-23T01:43:17.9134898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9135078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9135466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9135662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9135682Z 2022-11-23T01:43:17.9135775Z Running tests... 2022-11-23T01:43:17.9136042Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9136360Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9136630Z test_gather_object (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9136651Z 2022-11-23T01:43:17.9136917Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9137032Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9137234Z 2022-11-23T01:43:17.9137351Z OK (skipped=1) 2022-11-23T01:43:17.9137371Z 2022-11-23T01:43:17.9137496Z Generating XML reports... 
2022-11-23T01:43:17.9137952Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013512.xml 2022-11-23T01:43:17.9138312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9138493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9138879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9139081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9139101Z 2022-11-23T01:43:17.9139210Z Running tests... 2022-11-23T01:43:17.9139477Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9139798Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9140080Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9140100Z 2022-11-23T01:43:17.9140411Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9140510Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9140529Z 2022-11-23T01:43:17.9140638Z OK (skipped=1) 2022-11-23T01:43:17.9140657Z 2022-11-23T01:43:17.9140783Z Generating XML reports... 
2022-11-23T01:43:17.9141237Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013514.xml 2022-11-23T01:43:17.9141617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9141799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9142187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9142385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9142406Z 2022-11-23T01:43:17.9142516Z Running tests... 2022-11-23T01:43:17.9142767Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9143084Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9143338Z test_get_backend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9143629Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13835 2022-11-23T01:43:17.9143853Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13836 2022-11-23T01:43:17.9144234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9144418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9144808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9144988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9145361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9145539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9145922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9146120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9146374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9146624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9147081Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9147502Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9147718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9147962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9148191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9148435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9148837Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9149240Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9149350Z ok (4.356s) 2022-11-23T01:43:17.9149371Z 2022-11-23T01:43:17.9149642Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9149740Z Ran 1 test in 4.356s 2022-11-23T01:43:17.9149776Z 2022-11-23T01:43:17.9149854Z OK 2022-11-23T01:43:17.9149873Z 2022-11-23T01:43:17.9150001Z Generating XML reports... 2022-11-23T01:43:17.9150453Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013517.xml 2022-11-23T01:43:17.9150838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9151017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9151407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9151609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9151629Z 2022-11-23T01:43:17.9151738Z Running tests... 
2022-11-23T01:43:17.9151989Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9152306Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9152589Z test_get_future (__main__.TestDistBackendWithSpawn) ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:43:17.9152609Z 2022-11-23T01:43:17.9152877Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9153055Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9153075Z 2022-11-23T01:43:17.9153184Z OK (skipped=1) 2022-11-23T01:43:17.9153204Z 2022-11-23T01:43:17.9153328Z Generating XML reports... 2022-11-23T01:43:17.9153778Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013523.xml 2022-11-23T01:43:17.9154159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9154322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9154705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9154901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9154922Z 2022-11-23T01:43:17.9155234Z Running tests... 2022-11-23T01:43:17.9155520Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9155847Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9156098Z test_get_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9156321Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13971 2022-11-23T01:43:17.9156602Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13972 2022-11-23T01:43:17.9156999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9157180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9157569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9157767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9158151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9158329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9158714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9158913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9159146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9159394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9159889Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9160293Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9160535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9160767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9160870Z ok (4.452s) 2022-11-23T01:43:17.9160892Z 2022-11-23T01:43:17.9161165Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9161265Z Ran 1 test in 4.452s 2022-11-23T01:43:17.9161307Z 2022-11-23T01:43:17.9161384Z OK 2022-11-23T01:43:17.9161404Z 2022-11-23T01:43:17.9161531Z Generating XML reports... 2022-11-23T01:43:17.9161981Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013526.xml 2022-11-23T01:43:17.9162359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9162537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9163001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9163198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9163219Z 2022-11-23T01:43:17.9163332Z Running tests... 2022-11-23T01:43:17.9163583Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9163903Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9164179Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9164402Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14074 2022-11-23T01:43:17.9164622Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14075 2022-11-23T01:43:17.9165003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9165189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9165577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9165771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9166177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9166365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9166754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9166948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9167196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9167443Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9167853Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9168256Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9168478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9168722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9168949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9169189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9169589Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9169993Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9170099Z ok (4.263s) 2022-11-23T01:43:17.9170120Z 2022-11-23T01:43:17.9170389Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9170504Z Ran 1 test in 4.263s 2022-11-23T01:43:17.9170524Z 2022-11-23T01:43:17.9170605Z OK 2022-11-23T01:43:17.9170625Z 2022-11-23T01:43:17.9170750Z Generating XML reports... 2022-11-23T01:43:17.9171201Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013533.xml 2022-11-23T01:43:17.9171581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9171761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9172149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9172411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9172432Z 2022-11-23T01:43:17.9172543Z Running tests... 
2022-11-23T01:43:17.9172815Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9173120Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9173387Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9173611Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14177 2022-11-23T01:43:17.9173832Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14178 2022-11-23T01:43:17.9174213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9174393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9174784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9174982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9175343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9175572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9175967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9176162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9176411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9176658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9177072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9177480Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9177716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9177945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9178176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9178413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9178815Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9179216Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9179327Z ok (4.357s) 2022-11-23T01:43:17.9179348Z 2022-11-23T01:43:17.9179618Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9179733Z Ran 1 test in 4.357s 2022-11-23T01:43:17.9179754Z 2022-11-23T01:43:17.9179847Z OK 2022-11-23T01:43:17.9179867Z 2022-11-23T01:43:17.9179975Z Generating XML reports... 
2022-11-23T01:43:17.9180429Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013540.xml 2022-11-23T01:43:17.9180807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9180988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9181375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9181570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9181643Z 2022-11-23T01:43:17.9181761Z Running tests... 2022-11-23T01:43:17.9182030Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9182329Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9182603Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9182828Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14280 2022-11-23T01:43:17.9183050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14281 2022-11-23T01:43:17.9183430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9183609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9183992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9184190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9184565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9184728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9185165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9185371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9185619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9185862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9186270Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9186679Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9186914Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9187144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9187392Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3kaghyrr 2022-11-23T01:43:17.9187668Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3kaghyrr/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9187928Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp_aal8wx 2022-11-23T01:43:17.9188202Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp_aal8wx/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9188492Z [1669167351.933187] [d8f8c46cdf70:14280:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9188739Z [1669167351.939470] [d8f8c46cdf70:14280:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9188988Z [1669167351.939470] [d8f8c46cdf70:14280:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9189273Z [1669167351.937924] [d8f8c46cdf70:14281:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9189504Z [1669167351.945091] [d8f8c46cdf70:14281:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9189754Z [1669167351.945091] [d8f8c46cdf70:14281:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9189859Z ok (6.075s) 2022-11-23T01:43:17.9189880Z 2022-11-23T01:43:17.9190228Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9190348Z Ran 1 test in 6.075s 2022-11-23T01:43:17.9190368Z 2022-11-23T01:43:17.9190445Z OK 2022-11-23T01:43:17.9190466Z 2022-11-23T01:43:17.9190590Z Generating XML reports... 2022-11-23T01:43:17.9191040Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013547.xml 2022-11-23T01:43:17.9191428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9191612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9192002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9192202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9192222Z 2022-11-23T01:43:17.9192334Z Running tests... 2022-11-23T01:43:17.9192607Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9192907Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9193152Z test_irecv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9193378Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14398 2022-11-23T01:43:17.9193649Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14399 2022-11-23T01:43:17.9194036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9194219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9194608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9194799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9195373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9195562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9195959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9196160Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9196408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9196662Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9197075Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9197482Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9197722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9197934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9198221Z [1669167359.777408] [d8f8c46cdf70:14398:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9198463Z [1669167360.599716] [d8f8c46cdf70:14398:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9198710Z [1669167360.599716] [d8f8c46cdf70:14398:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9198992Z [1669167359.777381] [d8f8c46cdf70:14399:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9199232Z [1669167360.589735] [d8f8c46cdf70:14399:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9199566Z [1669167360.589735] [d8f8c46cdf70:14399:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9199671Z ok (5.694s) 2022-11-23T01:43:17.9199692Z 2022-11-23T01:43:17.9199965Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9200068Z Ran 1 test in 5.694s 2022-11-23T01:43:17.9200106Z 2022-11-23T01:43:17.9200183Z OK 2022-11-23T01:43:17.9200203Z 2022-11-23T01:43:17.9200328Z Generating XML reports... 
2022-11-23T01:43:17.9200785Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013555.xml 2022-11-23T01:43:17.9201169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9201351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9201746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9201945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9201966Z 2022-11-23T01:43:17.9202076Z Running tests... 2022-11-23T01:43:17.9202332Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9202710Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9202968Z test_isend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9203189Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14508 2022-11-23T01:43:17.9203407Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14509 2022-11-23T01:43:17.9203787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9203971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9204358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9204534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9204916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9205096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9205483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9205679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9205928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9206179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9206592Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9206998Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9207217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9207449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9207731Z [1669167368.105596] [d8f8c46cdf70:14509:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9207970Z [1669167368.901414] [d8f8c46cdf70:14509:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9208216Z [1669167368.901414] [d8f8c46cdf70:14509:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9208552Z [1669167368.084282] [d8f8c46cdf70:14508:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9208793Z [1669167368.900115] [d8f8c46cdf70:14508:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9209038Z [1669167368.900115] [d8f8c46cdf70:14508:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9209145Z ok (5.575s) 2022-11-23T01:43:17.9209165Z 2022-11-23T01:43:17.9209439Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9209537Z Ran 1 test in 5.575s 2022-11-23T01:43:17.9209557Z 2022-11-23T01:43:17.9209652Z OK 2022-11-23T01:43:17.9209673Z 2022-11-23T01:43:17.9209799Z Generating XML reports... 
2022-11-23T01:43:17.9210254Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013604.xml 2022-11-23T01:43:17.9210641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9210822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9211261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9211462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9211483Z 2022-11-23T01:43:17.9211575Z Running tests... 2022-11-23T01:43:17.9211843Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9212156Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9212433Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9212663Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14618 2022-11-23T01:43:17.9212887Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14619 2022-11-23T01:43:17.9213271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9213455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9213843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9214019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9214395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9214576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9214968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9215170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9215423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9215675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9216091Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9216500Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9216718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9217062Z STAGE:2022-11-23 01:36:16 14618:14618 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9217293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9217695Z STAGE:2022-11-23 01:36:16 14619:14619 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9217982Z [1669167376.252829] [d8f8c46cdf70:14618:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9218229Z [1669167377.311036] [d8f8c46cdf70:14618:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9218476Z [1669167377.311036] [d8f8c46cdf70:14618:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9218825Z STAGE:2022-11-23 01:36:17 14618:14618 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9219108Z [1669167376.255816] [d8f8c46cdf70:14619:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9219332Z [1669167377.306428] [d8f8c46cdf70:14619:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9219577Z [1669167377.306428] [d8f8c46cdf70:14619:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9219971Z STAGE:2022-11-23 01:36:17 14619:14619 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9220340Z STAGE:2022-11-23 01:36:17 14618:14618 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9220696Z STAGE:2022-11-23 01:36:17 14619:14619 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9220803Z ok (6.055s) 2022-11-23T01:43:17.9220824Z 2022-11-23T01:43:17.9221093Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9221210Z Ran 1 test in 6.055s 2022-11-23T01:43:17.9221230Z 2022-11-23T01:43:17.9221365Z OK 2022-11-23T01:43:17.9221386Z 2022-11-23T01:43:17.9221494Z Generating XML reports... 2022-11-23T01:43:17.9221954Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013612.xml 2022-11-23T01:43:17.9222338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9222526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9222918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9223118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9223139Z 2022-11-23T01:43:17.9223254Z Running tests... 
2022-11-23T01:43:17.9223522Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9223824Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9224098Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9224324Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14732 2022-11-23T01:43:17.9224550Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14733 2022-11-23T01:43:17.9224936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9225118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9225506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9225702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9226079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9226240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9226688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9226883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9227133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9227386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9227793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9228202Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9228442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9228788Z STAGE:2022-11-23 01:36:24 14733:14733 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9229007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9229351Z STAGE:2022-11-23 01:36:24 14732:14732 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9229690Z [1669167384.843399] [d8f8c46cdf70:14732:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9229938Z [1669167385.903119] [d8f8c46cdf70:14732:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9230185Z [1669167385.903119] [d8f8c46cdf70:14732:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9230538Z STAGE:2022-11-23 01:36:26 14732:14732 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9230822Z [1669167384.845236] [d8f8c46cdf70:14733:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9231063Z [1669167385.881020] [d8f8c46cdf70:14733:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9231311Z [1669167385.881020] [d8f8c46cdf70:14733:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9231657Z STAGE:2022-11-23 01:36:26 14733:14733 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9231997Z STAGE:2022-11-23 01:36:26 14732:14732 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9232353Z STAGE:2022-11-23 01:36:26 14733:14733 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9232460Z ok (5.970s) 2022-11-23T01:43:17.9232481Z 2022-11-23T01:43:17.9232752Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9232873Z Ran 1 test in 5.970s 2022-11-23T01:43:17.9232894Z 2022-11-23T01:43:17.9232989Z OK 2022-11-23T01:43:17.9233008Z 2022-11-23T01:43:17.9233135Z Generating XML reports... 2022-11-23T01:43:17.9233589Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013620.xml 2022-11-23T01:43:17.9233958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9234144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9234530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9234729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9234749Z 2022-11-23T01:43:17.9234859Z Running tests... 
2022-11-23T01:43:17.9235417Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9235846Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9236136Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9236360Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14846 2022-11-23T01:43:17.9236570Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14847 2022-11-23T01:43:17.9236951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9237134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9237521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9237715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9238095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9238281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9238671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9238911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9239174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9239419Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9241089Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9241495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9241740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9241986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9242212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9242453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9242856Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9243238Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9243488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:43:17.9243734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:43:17.9244132Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.9244525Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:43:17.9244765Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T01:43:17.9244873Z ok (21.141s) 2022-11-23T01:43:17.9244894Z 2022-11-23T01:43:17.9245167Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9245285Z Ran 1 test in 21.141s 2022-11-23T01:43:17.9245305Z 2022-11-23T01:43:17.9245382Z OK 2022-11-23T01:43:17.9245401Z 2022-11-23T01:43:17.9245528Z Generating XML reports... 2022-11-23T01:43:17.9245985Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013629.xml 2022-11-23T01:43:17.9246362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9246609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9247002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9247206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9247227Z 2022-11-23T01:43:17.9247340Z Running tests... 2022-11-23T01:43:17.9247592Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9247911Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9248216Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9248439Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14967 2022-11-23T01:43:17.9248665Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14968 2022-11-23T01:43:17.9249042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9249224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9249654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9249860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9250223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9250401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9250785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9250978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9251234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9251484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9251896Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9252301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9252538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9252766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9252994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9253234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9253645Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9254047Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9254299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:43:17.9254544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:43:17.9254942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.9255342Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.9255564Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T01:43:17.9255869Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T01:43:17.9255974Z ok (21.177s) 2022-11-23T01:43:17.9255995Z 2022-11-23T01:43:17.9256267Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9256384Z Ran 1 test in 21.177s 2022-11-23T01:43:17.9256404Z 2022-11-23T01:43:17.9256504Z OK 2022-11-23T01:43:17.9256524Z 2022-11-23T01:43:17.9256652Z Generating XML reports... 
2022-11-23T01:43:17.9257110Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013653.xml 2022-11-23T01:43:17.9257476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9257657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9258047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9258249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9258269Z 2022-11-23T01:43:17.9258377Z Running tests... 2022-11-23T01:43:17.9258647Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9259017Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9259445Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.9259467Z 2022-11-23T01:43:17.9259735Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9259831Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9259851Z 2022-11-23T01:43:17.9259962Z OK (skipped=1) 2022-11-23T01:43:17.9259981Z 2022-11-23T01:43:17.9260106Z Generating XML reports... 
2022-11-23T01:43:17.9260560Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013716.xml 2022-11-23T01:43:17.9260944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9261127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9261522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9261720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9261740Z 2022-11-23T01:43:17.9261855Z Running tests... 2022-11-23T01:43:17.9262104Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9262423Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9262824Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.9262847Z 2022-11-23T01:43:17.9263116Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9263232Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9263252Z 2022-11-23T01:43:17.9263361Z OK (skipped=1) 2022-11-23T01:43:17.9263381Z 2022-11-23T01:43:17.9263507Z Generating XML reports... 
2022-11-23T01:43:17.9263960Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013719.xml 2022-11-23T01:43:17.9264342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9264504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9264890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9265087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9265108Z 2022-11-23T01:43:17.9265273Z Running tests... 2022-11-23T01:43:17.9265543Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9265858Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9266285Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.9266310Z 2022-11-23T01:43:17.9266577Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9266674Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9266715Z 2022-11-23T01:43:17.9266806Z OK (skipped=1) 2022-11-23T01:43:17.9266826Z 2022-11-23T01:43:17.9266949Z Generating XML reports... 
2022-11-23T01:43:17.9267400Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013721.xml 2022-11-23T01:43:17.9267779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9267964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9268351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9268548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9268568Z 2022-11-23T01:43:17.9268727Z Running tests... 2022-11-23T01:43:17.9268987Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9269300Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9269717Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.9269738Z 2022-11-23T01:43:17.9270005Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9270119Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9270143Z 2022-11-23T01:43:17.9270258Z OK (skipped=1) 2022-11-23T01:43:17.9270277Z 2022-11-23T01:43:17.9270404Z Generating XML reports... 
2022-11-23T01:43:17.9270856Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013723.xml 2022-11-23T01:43:17.9271241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9271405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9271797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9271998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9272019Z 2022-11-23T01:43:17.9272133Z Running tests... 2022-11-23T01:43:17.9272400Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9272713Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9273137Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.9273158Z 2022-11-23T01:43:17.9273419Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9273536Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9273559Z 2022-11-23T01:43:17.9273652Z OK (skipped=1) 2022-11-23T01:43:17.9273672Z 2022-11-23T01:43:17.9273798Z Generating XML reports... 
2022-11-23T01:43:17.9274289Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013726.xml 2022-11-23T01:43:17.9274668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9274845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9275453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9275748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9275771Z 2022-11-23T01:43:17.9275880Z Running tests... 2022-11-23T01:43:17.9276133Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9276453Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9276865Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T01:43:17.9276886Z 2022-11-23T01:43:17.9277153Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9277269Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9277289Z 2022-11-23T01:43:17.9277400Z OK (skipped=1) 2022-11-23T01:43:17.9277419Z 2022-11-23T01:43:17.9277546Z Generating XML reports... 
2022-11-23T01:43:17.9278002Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013728.xml 2022-11-23T01:43:17.9278383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9278543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9278990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9279199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9279220Z 2022-11-23T01:43:17.9279331Z Running tests... 2022-11-23T01:43:17.9279599Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9279917Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9280325Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T01:43:17.9280351Z 2022-11-23T01:43:17.9280618Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9280733Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9280753Z 2022-11-23T01:43:17.9280846Z OK (skipped=1) 2022-11-23T01:43:17.9280884Z 2022-11-23T01:43:17.9280991Z Generating XML reports... 
2022-11-23T01:43:17.9281446Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013731.xml 2022-11-23T01:43:17.9281826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9282002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9282386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9282581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9282605Z 2022-11-23T01:43:17.9282717Z Running tests... 2022-11-23T01:43:17.9282983Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9283283Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9283694Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T01:43:17.9283715Z 2022-11-23T01:43:17.9283981Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9284095Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9284115Z 2022-11-23T01:43:17.9284229Z OK (skipped=1) 2022-11-23T01:43:17.9284249Z 2022-11-23T01:43:17.9284379Z Generating XML reports... 
2022-11-23T01:43:17.9284823Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013733.xml 2022-11-23T01:43:17.9285203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9285504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9285872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9297002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9297042Z 2022-11-23T01:43:17.9297195Z Running tests... 2022-11-23T01:43:17.9297512Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9297843Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9298239Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T01:43:17.9298259Z 2022-11-23T01:43:17.9298527Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9298648Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9298668Z 2022-11-23T01:43:17.9298776Z OK (skipped=1) 2022-11-23T01:43:17.9298795Z 2022-11-23T01:43:17.9298919Z Generating XML reports... 
2022-11-23T01:43:17.9299380Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013735.xml 2022-11-23T01:43:17.9299870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9300067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9300444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9300643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9300664Z 2022-11-23T01:43:17.9300774Z Running tests... 2022-11-23T01:43:17.9301042Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9301366Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9301667Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL backend supports high priority stream (0.002s) 2022-11-23T01:43:17.9301687Z 2022-11-23T01:43:17.9301951Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9302071Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9302091Z 2022-11-23T01:43:17.9302202Z OK (skipped=1) 2022-11-23T01:43:17.9302222Z 2022-11-23T01:43:17.9302330Z Generating XML reports... 
2022-11-23T01:43:17.9302788Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013738.xml 2022-11-23T01:43:17.9303168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9303348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9303743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9303939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9303959Z 2022-11-23T01:43:17.9304069Z Running tests... 2022-11-23T01:43:17.9304335Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9304653Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9304893Z test_new_subgroups (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:43:17.9304913Z 2022-11-23T01:43:17.9305177Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9305290Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9305310Z 2022-11-23T01:43:17.9305420Z OK (skipped=1) 2022-11-23T01:43:17.9305439Z 2022-11-23T01:43:17.9305563Z Generating XML reports... 
2022-11-23T01:43:17.9306090Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013740.xml 2022-11-23T01:43:17.9306471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9306651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9307045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9307224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9307244Z 2022-11-23T01:43:17.9307352Z Running tests... 2022-11-23T01:43:17.9307618Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9307932Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9308207Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:43:17.9308232Z 2022-11-23T01:43:17.9308500Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9308614Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9308634Z 2022-11-23T01:43:17.9308742Z OK (skipped=1) 2022-11-23T01:43:17.9308762Z 2022-11-23T01:43:17.9308868Z Generating XML reports... 
2022-11-23T01:43:17.9309376Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013743.xml 2022-11-23T01:43:17.9309768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9309947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9310332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9310527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9310552Z 2022-11-23T01:43:17.9310661Z Running tests... 2022-11-23T01:43:17.9310927Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9311246Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9311548Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:43:17.9311588Z 2022-11-23T01:43:17.9311834Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9311947Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9311967Z 2022-11-23T01:43:17.9312077Z OK (skipped=1) 2022-11-23T01:43:17.9312097Z 2022-11-23T01:43:17.9312220Z Generating XML reports... 
2022-11-23T01:43:17.9312671Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013745.xml 2022-11-23T01:43:17.9313050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9313233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9313622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9313805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9313842Z 2022-11-23T01:43:17.9313934Z Running tests... 2022-11-23T01:43:17.9314199Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9314512Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9314817Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9315314Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15517 2022-11-23T01:43:17.9315665Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15518 2022-11-23T01:43:17.9316052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9316232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9316603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9316800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9317175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9317355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9317740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9317934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9318188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9318438Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9318891Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9319311Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9319546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9319776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9319879Z ok (4.232s) 2022-11-23T01:43:17.9319900Z 2022-11-23T01:43:17.9320167Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9320288Z Ran 1 test in 4.232s 2022-11-23T01:43:17.9320308Z 2022-11-23T01:43:17.9320401Z OK 2022-11-23T01:43:17.9320421Z 2022-11-23T01:43:17.9320546Z Generating XML reports... 2022-11-23T01:43:17.9320985Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013747.xml 2022-11-23T01:43:17.9321407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9321589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9321973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9322169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9322189Z 2022-11-23T01:43:17.9322299Z Running tests... 2022-11-23T01:43:17.9322570Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9322892Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9323173Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9323395Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15620 2022-11-23T01:43:17.9323620Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15621 2022-11-23T01:43:17.9323997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9324174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9324549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9324726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9325109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9325367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9325736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9325936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9326187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9326431Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9326833Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9327234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9327473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9327706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9327811Z ok (4.268s) 2022-11-23T01:43:17.9327831Z 2022-11-23T01:43:17.9328083Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9328251Z Ran 1 test in 4.268s 2022-11-23T01:43:17.9328274Z 2022-11-23T01:43:17.9328370Z OK 2022-11-23T01:43:17.9328390Z 2022-11-23T01:43:17.9328517Z Generating XML reports... 2022-11-23T01:43:17.9328970Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013754.xml 2022-11-23T01:43:17.9329343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9329522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9329907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9330091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9330127Z 2022-11-23T01:43:17.9330219Z Running tests... 2022-11-23T01:43:17.9330484Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9330800Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9331078Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 4 (0.002s) 2022-11-23T01:43:17.9331097Z 2022-11-23T01:43:17.9331357Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9331471Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9331491Z 2022-11-23T01:43:17.9331599Z OK (skipped=1) 2022-11-23T01:43:17.9331618Z 2022-11-23T01:43:17.9331743Z Generating XML reports... 2022-11-23T01:43:17.9332176Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013801.xml 2022-11-23T01:43:17.9332548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9332726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9333110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9333306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9333326Z 2022-11-23T01:43:17.9333434Z Running tests... 2022-11-23T01:43:17.9333696Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9334008Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9334312Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:43:17.9334401Z 2022-11-23T01:43:17.9334659Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9334774Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9334794Z 2022-11-23T01:43:17.9334904Z OK (skipped=1) 2022-11-23T01:43:17.9334923Z 2022-11-23T01:43:17.9335048Z Generating XML reports... 
2022-11-23T01:43:17.9335504Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013803.xml 2022-11-23T01:43:17.9335879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9336056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9336443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9336622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9336664Z 2022-11-23T01:43:17.9336758Z Running tests... 2022-11-23T01:43:17.9337025Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9337340Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9337668Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9338438Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78112 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.660s) 2022-11-23T01:43:17.9338459Z 2022-11-23T01:43:17.9338728Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9338842Z Ran 1 test in 1.660s 2022-11-23T01:43:17.9338862Z 2022-11-23T01:43:17.9338977Z OK (skipped=1) 2022-11-23T01:43:17.9338996Z 2022-11-23T01:43:17.9339123Z Generating XML reports... 
2022-11-23T01:43:17.9339560Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013806.xml 2022-11-23T01:43:17.9339940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9340121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9340507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9340701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9340721Z 2022-11-23T01:43:17.9340830Z Running tests... 2022-11-23T01:43:17.9341095Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9341410Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9341697Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9341903Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15823 2022-11-23T01:43:17.9342122Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15824 2022-11-23T01:43:17.9342499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9342678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9343060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9343252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9343622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9343864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9344226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9344417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9344670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9344919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9345321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9345724Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9345958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9346197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9346456Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp91z_qcv6 2022-11-23T01:43:17.9346710Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp91z_qcv6/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9347015Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp93l_fvhh 2022-11-23T01:43:17.9347295Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp93l_fvhh/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9347576Z [1669167495.220400] [d8f8c46cdf70:15823:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9347818Z [1669167495.227273] [d8f8c46cdf70:15823:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9348067Z [1669167495.227273] [d8f8c46cdf70:15823:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9348350Z [1669167495.229242] [d8f8c46cdf70:15824:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9348589Z [1669167495.235663] [d8f8c46cdf70:15824:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9348834Z [1669167495.235663] [d8f8c46cdf70:15824:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9348939Z ok (6.064s) 2022-11-23T01:43:17.9348959Z 2022-11-23T01:43:17.9349217Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9349334Z Ran 1 test in 6.064s 2022-11-23T01:43:17.9349354Z 2022-11-23T01:43:17.9349447Z OK 2022-11-23T01:43:17.9349466Z 2022-11-23T01:43:17.9349590Z Generating XML reports... 2022-11-23T01:43:17.9350049Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013810.xml 2022-11-23T01:43:17.9350431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9350609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9350998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9351179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9351216Z 2022-11-23T01:43:17.9351310Z Running tests... 2022-11-23T01:43:17.9351574Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9351889Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9352165Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9352453Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15941 2022-11-23T01:43:17.9352674Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15942 2022-11-23T01:43:17.9353052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9353234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9353604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9353800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9354175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9354353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9354738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9354935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9355416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9355747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9356157Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9356560Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9356794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9357024Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9357312Z [1669167504.075183] [d8f8c46cdf70:15941:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9357592Z [1669167504.075609] [d8f8c46cdf70:15942:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9357833Z [1669167504.081606] [d8f8c46cdf70:15941:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9358079Z [1669167504.081606] [d8f8c46cdf70:15941:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9358309Z [1669167504.081619] [d8f8c46cdf70:15942:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9358547Z [1669167504.081619] [d8f8c46cdf70:15942:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9358639Z ok (5.835s) 2022-11-23T01:43:17.9358677Z 2022-11-23T01:43:17.9358932Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9359047Z Ran 1 test in 5.835s 2022-11-23T01:43:17.9359067Z 2022-11-23T01:43:17.9359160Z OK 2022-11-23T01:43:17.9359179Z 2022-11-23T01:43:17.9359304Z Generating XML reports... 
2022-11-23T01:43:17.9359756Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013819.xml 2022-11-23T01:43:17.9360137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9360314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9360700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9360880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9360900Z 2022-11-23T01:43:17.9361082Z Running tests... 2022-11-23T01:43:17.9361356Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9361671Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9361965Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9362191Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16056 2022-11-23T01:43:17.9362412Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16057 2022-11-23T01:43:17.9362790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9362954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9363345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9363545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9363922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9364100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9364539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9364743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9364991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9365238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9365630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9366034Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9366275Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9366507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9366794Z [1669167512.499691] [d8f8c46cdf70:16056:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9367076Z [1669167512.506705] [d8f8c46cdf70:16057:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9367315Z [1669167512.505774] [d8f8c46cdf70:16056:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9367568Z [1669167512.505774] [d8f8c46cdf70:16056:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9367980Z [1669167512.513785] [d8f8c46cdf70:16057:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9368229Z [1669167512.513785] [d8f8c46cdf70:16057:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9368317Z ok (5.966s) 2022-11-23T01:43:17.9368339Z 2022-11-23T01:43:17.9368623Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9368741Z Ran 1 test in 5.966s 2022-11-23T01:43:17.9368760Z 2022-11-23T01:43:17.9368853Z OK 2022-11-23T01:43:17.9368872Z 2022-11-23T01:43:17.9368994Z Generating XML reports... 
2022-11-23T01:43:17.9369443Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013827.xml 2022-11-23T01:43:17.9369824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9370280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9370672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9370869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9370890Z 2022-11-23T01:43:17.9371001Z Running tests... 2022-11-23T01:43:17.9371276Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9371591Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9371872Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9372630Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77123 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.639s) 2022-11-23T01:43:17.9372655Z 2022-11-23T01:43:17.9372923Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9373038Z Ran 1 test in 1.640s 2022-11-23T01:43:17.9373059Z 2022-11-23T01:43:17.9373170Z OK (skipped=1) 2022-11-23T01:43:17.9373189Z 2022-11-23T01:43:17.9373297Z Generating XML reports... 
2022-11-23T01:43:17.9373804Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013835.xml 2022-11-23T01:43:17.9374197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9374374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9374758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9374954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9374978Z 2022-11-23T01:43:17.9375089Z Running tests... 2022-11-23T01:43:17.9375353Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9375654Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9375959Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9376708Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77292 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.665s) 2022-11-23T01:43:17.9376729Z 2022-11-23T01:43:17.9376998Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9377115Z Ran 1 test in 1.665s 2022-11-23T01:43:17.9377138Z 2022-11-23T01:43:17.9377248Z OK (skipped=1) 2022-11-23T01:43:17.9377268Z 2022-11-23T01:43:17.9377392Z Generating XML reports... 
2022-11-23T01:43:17.9377843Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013840.xml 2022-11-23T01:43:17.9378226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9378406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9378775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9378971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9378991Z 2022-11-23T01:43:17.9379099Z Running tests... 2022-11-23T01:43:17.9379363Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9379679Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9380061Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9380285Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16239 2022-11-23T01:43:17.9380511Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16240 2022-11-23T01:43:17.9380890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9381053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9381436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9381630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9382004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9382187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9382571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9382763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9383058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9383298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9383706Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9384110Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9384342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9384580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9384733Z skip: Need at least 4 CUDA devices (4.221s) 2022-11-23T01:43:17.9384754Z 2022-11-23T01:43:17.9385026Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9385142Z Ran 1 test in 4.221s 2022-11-23T01:43:17.9385165Z 2022-11-23T01:43:17.9385275Z OK (skipped=1) 2022-11-23T01:43:17.9385295Z 2022-11-23T01:43:17.9385403Z Generating XML reports... 2022-11-23T01:43:17.9385856Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013844.xml 2022-11-23T01:43:17.9386233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9386414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9386796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9386996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9387015Z 2022-11-23T01:43:17.9387124Z Running tests... 2022-11-23T01:43:17.9387393Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9387712Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9388028Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9388250Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16342 2022-11-23T01:43:17.9388469Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16343 2022-11-23T01:43:17.9388847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9389087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9389479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9389672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9390047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9390209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9390593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9390785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9391032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9391279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9391688Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9392096Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9392382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9392624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9392759Z skip: Need at least 4 CUDA devices (4.266s) 2022-11-23T01:43:17.9392796Z 2022-11-23T01:43:17.9393050Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9393162Z Ran 1 test in 4.267s 2022-11-23T01:43:17.9393182Z 2022-11-23T01:43:17.9393290Z OK (skipped=1) 2022-11-23T01:43:17.9393309Z 2022-11-23T01:43:17.9393439Z Generating XML reports... 2022-11-23T01:43:17.9393890Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013851.xml 2022-11-23T01:43:17.9394265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9394446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9394835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9395014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9395197Z 2022-11-23T01:43:17.9395322Z Running tests... 2022-11-23T01:43:17.9395594Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9395910Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9396201Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9396956Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/84886 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.608s) 2022-11-23T01:43:17.9396981Z 2022-11-23T01:43:17.9397247Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9397363Z Ran 1 test in 1.608s 2022-11-23T01:43:17.9397383Z 2022-11-23T01:43:17.9397492Z OK (skipped=1) 2022-11-23T01:43:17.9397511Z 2022-11-23T01:43:17.9397635Z Generating XML reports... 2022-11-23T01:43:17.9398068Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013857.xml 2022-11-23T01:43:17.9398441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9398728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9399116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9399313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9399333Z 2022-11-23T01:43:17.9399445Z Running tests... 2022-11-23T01:43:17.9399714Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9400027Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9400278Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9400502Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16479 2022-11-23T01:43:17.9400720Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16480 2022-11-23T01:43:17.9401102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9401281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9401663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9401919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9402307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9402485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9402851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9403048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9403297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9403548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9403955Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9404362Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9404595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9404836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9405045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9405284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9405693Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9406091Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9406433Z STAGE:2022-11-23 01:39:05 16479:16479 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9406768Z STAGE:2022-11-23 01:39:05 16480:16480 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9407056Z [1669167545.972268] [d8f8c46cdf70:16480:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9407295Z [1669167547.008898] [d8f8c46cdf70:16480:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9407541Z [1669167547.008898] [d8f8c46cdf70:16480:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9407879Z [1669167545.970522] [d8f8c46cdf70:16479:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9408099Z [1669167547.021062] [d8f8c46cdf70:16479:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9408348Z [1669167547.021062] [d8f8c46cdf70:16479:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9408914Z STAGE:2022-11-23 01:39:07 16480:16480 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:39:07 16479:16479 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9408935Z 2022-11-23T01:43:17.9409292Z STAGE:2022-11-23 01:39:07 16480:16480 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9409645Z STAGE:2022-11-23 01:39:07 16479:16479 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9409980Z STAGE:2022-11-23 01:39:07 16480:16480 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9410310Z STAGE:2022-11-23 01:39:07 16479:16479 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9410699Z STAGE:2022-11-23 01:39:07 16480:16480 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9411047Z STAGE:2022-11-23 01:39:07 16479:16479 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9411395Z STAGE:2022-11-23 01:39:07 16480:16480 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9411724Z STAGE:2022-11-23 01:39:07 16479:16479 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9411828Z ok (5.851s) 2022-11-23T01:43:17.9411848Z 2022-11-23T01:43:17.9412114Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9412231Z Ran 1 test in 5.851s 2022-11-23T01:43:17.9412251Z 2022-11-23T01:43:17.9412344Z OK 2022-11-23T01:43:17.9412363Z 2022-11-23T01:43:17.9412486Z Generating XML reports... 
2022-11-23T01:43:17.9412941Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013902.xml 2022-11-23T01:43:17.9413326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9413509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9413880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9414076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9414097Z 2022-11-23T01:43:17.9414206Z Running tests... 2022-11-23T01:43:17.9414472Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9414793Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9415060Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9415283Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16593 2022-11-23T01:43:17.9415506Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16594 2022-11-23T01:43:17.9415872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9416051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9416439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9416637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9417010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9417250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9417641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9417837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9418091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9418323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9418727Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9419129Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9419361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9419608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9419833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9420116Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9420527Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9420922Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9421248Z STAGE:2022-11-23 01:39:14 16594:16594 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9421622Z STAGE:2022-11-23 01:39:14 16593:16593 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9421916Z [1669167554.345948] [d8f8c46cdf70:16593:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9422158Z [1669167555.405360] [d8f8c46cdf70:16593:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9422407Z [1669167555.405360] [d8f8c46cdf70:16593:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9422758Z STAGE:2022-11-23 01:39:15 16593:16593 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9423042Z [1669167554.369391] [d8f8c46cdf70:16594:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9423280Z [1669167555.386904] [d8f8c46cdf70:16594:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9423525Z [1669167555.386904] [d8f8c46cdf70:16594:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9423878Z STAGE:2022-11-23 01:39:15 16594:16594 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9424444Z STAGE:2022-11-23 01:39:15 16594:16594 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:39:15 16593:16593 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9424483Z 2022-11-23T01:43:17.9424801Z STAGE:2022-11-23 01:39:15 16594:16594 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9425130Z STAGE:2022-11-23 01:39:15 16593:16593 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9425466Z STAGE:2022-11-23 01:39:15 16594:16594 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9425815Z STAGE:2022-11-23 01:39:15 16594:16594 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9426217Z STAGE:2022-11-23 01:39:15 16593:16593 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9426565Z STAGE:2022-11-23 01:39:15 16593:16593 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9426671Z ok (5.929s) 2022-11-23T01:43:17.9426692Z 2022-11-23T01:43:17.9426967Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9427065Z Ran 1 test in 5.929s 2022-11-23T01:43:17.9427101Z 2022-11-23T01:43:17.9427178Z OK 2022-11-23T01:43:17.9427197Z 2022-11-23T01:43:17.9427322Z Generating XML reports... 
2022-11-23T01:43:17.9427774Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013910.xml 2022-11-23T01:43:17.9428149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9428328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9428720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9428915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9428936Z 2022-11-23T01:43:17.9429048Z Running tests... 2022-11-23T01:43:17.9429350Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9429678Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9429955Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9430177Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16707 2022-11-23T01:43:17.9430398Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16708 2022-11-23T01:43:17.9430774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9430958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9431346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9431522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9431902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9432082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9432469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9432662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9432910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9433161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9433566Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9433969Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9434190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9434432Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9434655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9434894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9435542Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9435983Z STAGE:2022-11-23 01:39:22 16707:16707 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9436380Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9436721Z STAGE:2022-11-23 01:39:22 16708:16708 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9437004Z [1669167562.878947] [d8f8c46cdf70:16708:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9437244Z [1669167563.903360] [d8f8c46cdf70:16708:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9437476Z [1669167563.903360] [d8f8c46cdf70:16708:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9437753Z [1669167562.856822] [d8f8c46cdf70:16707:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9437994Z [1669167563.901162] [d8f8c46cdf70:16707:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9438360Z [1669167563.901162] [d8f8c46cdf70:16707:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9438938Z STAGE:2022-11-23 01:39:24 16708:16708 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:39:24 16707:16707 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9438959Z 2022-11-23T01:43:17.9439539Z STAGE:2022-11-23 01:39:24 16708:16708 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:39:24 16707:16707 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9439560Z 2022-11-23T01:43:17.9439897Z STAGE:2022-11-23 01:39:24 16708:16708 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9440268Z STAGE:2022-11-23 01:39:24 16707:16707 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9440618Z STAGE:2022-11-23 01:39:24 16708:16708 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9440955Z STAGE:2022-11-23 01:39:24 16707:16707 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9441307Z STAGE:2022-11-23 01:39:24 16708:16708 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9441643Z STAGE:2022-11-23 01:39:24 16707:16707 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9441747Z ok (5.870s) 2022-11-23T01:43:17.9441769Z 2022-11-23T01:43:17.9442035Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9442149Z Ran 1 test in 5.870s 2022-11-23T01:43:17.9442169Z 2022-11-23T01:43:17.9442268Z OK 2022-11-23T01:43:17.9442287Z 2022-11-23T01:43:17.9442411Z Generating XML reports... 
2022-11-23T01:43:17.9442866Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013918.xml 2022-11-23T01:43:17.9443250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9443416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9443802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9443997Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9444017Z 2022-11-23T01:43:17.9444126Z Running tests... 2022-11-23T01:43:17.9444393Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9444710Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9445041Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9445265Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16821 2022-11-23T01:43:17.9445484Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16822 2022-11-23T01:43:17.9445848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9446032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9446418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9446613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9446989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9447171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9447558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9447752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9448030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9448291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9448701Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9449104Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9449336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9449586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9449812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9450054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9450458Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9450839Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9451184Z STAGE:2022-11-23 01:39:31 16822:16822 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9451513Z STAGE:2022-11-23 01:39:31 16821:16821 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9451796Z [1669167571.314283] [d8f8c46cdf70:16822:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9452041Z [1669167572.328240] [d8f8c46cdf70:16822:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9452286Z [1669167572.328240] [d8f8c46cdf70:16822:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9452567Z [1669167571.293391] [d8f8c46cdf70:16821:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9452804Z [1669167572.326494] [d8f8c46cdf70:16821:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9453046Z [1669167572.326494] [d8f8c46cdf70:16821:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9453607Z STAGE:2022-11-23 01:39:32 16822:16822 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:39:32 16821:16821 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9453686Z 2022-11-23T01:43:17.9454054Z STAGE:2022-11-23 01:39:32 16821:16821 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9454390Z STAGE:2022-11-23 01:39:32 16822:16822 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9454729Z STAGE:2022-11-23 01:39:32 16822:16822 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9455055Z STAGE:2022-11-23 01:39:32 16821:16821 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9455392Z STAGE:2022-11-23 01:39:32 16822:16822 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9455725Z STAGE:2022-11-23 01:39:32 16821:16821 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9456072Z STAGE:2022-11-23 01:39:32 16822:16822 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9456431Z STAGE:2022-11-23 01:39:32 16821:16821 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9456536Z ok (5.867s) 2022-11-23T01:43:17.9456556Z 2022-11-23T01:43:17.9456826Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9456924Z Ran 1 test in 5.867s 2022-11-23T01:43:17.9456993Z 2022-11-23T01:43:17.9457093Z OK 2022-11-23T01:43:17.9457114Z 2022-11-23T01:43:17.9457242Z Generating XML reports... 
2022-11-23T01:43:17.9457698Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013927.xml 2022-11-23T01:43:17.9458075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9458253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9458640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9458840Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9458860Z 2022-11-23T01:43:17.9458972Z Running tests... 2022-11-23T01:43:17.9459223Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9459546Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9459807Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9460028Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16935 2022-11-23T01:43:17.9460247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16936 2022-11-23T01:43:17.9460625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9460804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9461193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9461370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9461749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9461927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9462313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9462505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9462754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9463002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9463486Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9463890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9464113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9464345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9464508Z skip: Skipped due to small world size. (4.255s) 2022-11-23T01:43:17.9464529Z 2022-11-23T01:43:17.9464796Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9464910Z Ran 1 test in 4.256s 2022-11-23T01:43:17.9464930Z 2022-11-23T01:43:17.9465040Z OK (skipped=1) 2022-11-23T01:43:17.9465060Z 2022-11-23T01:43:17.9465183Z Generating XML reports... 2022-11-23T01:43:17.9465641Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013935.xml 2022-11-23T01:43:17.9466008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9466187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9466624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9466828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9466848Z 2022-11-23T01:43:17.9466958Z Running tests... 2022-11-23T01:43:17.9467224Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9467540Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9467800Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9468025Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17038 2022-11-23T01:43:17.9468231Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17039 2022-11-23T01:43:17.9468606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9468787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9469172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9469365Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9469740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9469916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9470301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9470481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9470728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9470974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9471381Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9471784Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9472016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9472246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9472409Z skip: Skipped due to small world size. (4.244s) 2022-11-23T01:43:17.9472482Z 2022-11-23T01:43:17.9472765Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9472862Z Ran 1 test in 4.244s 2022-11-23T01:43:17.9472900Z 2022-11-23T01:43:17.9472992Z OK (skipped=1) 2022-11-23T01:43:17.9473011Z 2022-11-23T01:43:17.9473137Z Generating XML reports... 2022-11-23T01:43:17.9473602Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013942.xml 2022-11-23T01:43:17.9473983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9474161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9474545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9474739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9474762Z 2022-11-23T01:43:17.9474872Z Running tests... 2022-11-23T01:43:17.9475313Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9475641Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9475908Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9476207Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17141 2022-11-23T01:43:17.9476445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17142 2022-11-23T01:43:17.9476828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9477007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9477391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9477571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9477947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9478126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9478509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9478705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9478951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9479197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9479604Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9480010Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9480227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9480458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9480622Z skip: Skipped due to small world size. (4.250s) 2022-11-23T01:43:17.9480644Z 2022-11-23T01:43:17.9480913Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9481030Z Ran 1 test in 4.250s 2022-11-23T01:43:17.9481050Z 2022-11-23T01:43:17.9481160Z OK (skipped=1) 2022-11-23T01:43:17.9481180Z 2022-11-23T01:43:17.9481304Z Generating XML reports... 2022-11-23T01:43:17.9481757Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013949.xml 2022-11-23T01:43:17.9482132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9482372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9482762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9482956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9482976Z 2022-11-23T01:43:17.9483091Z Running tests... 2022-11-23T01:43:17.9483356Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9483673Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9483933Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9484153Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17244 2022-11-23T01:43:17.9484355Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17245 2022-11-23T01:43:17.9484737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9484917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9485299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9485538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9485926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9486103Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9486485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9486678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9486913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9487160Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9487566Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9487972Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9488205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9488433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9488594Z skip: Skipped due to small world size. (4.224s) 2022-11-23T01:43:17.9488614Z 2022-11-23T01:43:17.9488883Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9488981Z Ran 1 test in 4.224s 2022-11-23T01:43:17.9489021Z 2022-11-23T01:43:17.9489112Z OK (skipped=1) 2022-11-23T01:43:17.9489131Z 2022-11-23T01:43:17.9489256Z Generating XML reports... 2022-11-23T01:43:17.9489708Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013956.xml 2022-11-23T01:43:17.9490086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9490264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9490650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9490846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9490866Z 2022-11-23T01:43:17.9490976Z Running tests... 2022-11-23T01:43:17.9491224Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9491604Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9491854Z test_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9492074Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17347 2022-11-23T01:43:17.9492295Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17348 2022-11-23T01:43:17.9492676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9492856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9493242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9493438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9493795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9493980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9494364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9494555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9494848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9495105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9495510Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9495913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9496130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9496478Z STAGE:2022-11-23 01:40:06 17348:17348 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9496711Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9497051Z STAGE:2022-11-23 01:40:06 17347:17347 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9497339Z [1669167606.851890] [d8f8c46cdf70:17348:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9497580Z [1669167607.894436] [d8f8c46cdf70:17348:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9497827Z [1669167607.894436] [d8f8c46cdf70:17348:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9498106Z [1669167606.829373] [d8f8c46cdf70:17347:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9498345Z [1669167607.884236] [d8f8c46cdf70:17347:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9498587Z [1669167607.884236] [d8f8c46cdf70:17347:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9499137Z STAGE:2022-11-23 01:40:08 17348:17348 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:40:08 17347:17347 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9499175Z 2022-11-23T01:43:17.9499517Z STAGE:2022-11-23 01:40:08 17348:17348 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9499872Z STAGE:2022-11-23 01:40:08 17347:17347 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9500203Z STAGE:2022-11-23 01:40:08 17348:17348 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9500594Z STAGE:2022-11-23 01:40:08 17347:17347 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9500934Z STAGE:2022-11-23 01:40:08 17348:17348 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9501272Z STAGE:2022-11-23 01:40:08 17347:17347 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9501621Z STAGE:2022-11-23 01:40:08 17348:17348 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9501969Z STAGE:2022-11-23 01:40:08 17347:17347 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9502076Z ok (5.931s) 2022-11-23T01:43:17.9502096Z 2022-11-23T01:43:17.9502346Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9502460Z Ran 1 test in 5.932s 2022-11-23T01:43:17.9502480Z 2022-11-23T01:43:17.9502573Z OK 2022-11-23T01:43:17.9502596Z 2022-11-23T01:43:17.9502722Z Generating XML reports... 
2022-11-23T01:43:17.9503176Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014002.xml 2022-11-23T01:43:17.9503558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9503789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9504188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9504367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9504404Z 2022-11-23T01:43:17.9504497Z Running tests... 2022-11-23T01:43:17.9504769Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9505085Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9505341Z test_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9505564Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17461 2022-11-23T01:43:17.9505785Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17462 2022-11-23T01:43:17.9506169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9506348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9506716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9506911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9507282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9507459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9507848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9508041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9508290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9508538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9508928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9509332Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9509565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9509907Z STAGE:2022-11-23 01:40:15 17462:17462 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9510198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9510541Z STAGE:2022-11-23 01:40:15 17461:17461 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9510829Z [1669167615.332555] [d8f8c46cdf70:17462:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9511071Z [1669167616.382007] [d8f8c46cdf70:17462:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9511318Z [1669167616.382007] [d8f8c46cdf70:17462:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9511596Z [1669167615.325777] [d8f8c46cdf70:17461:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9511820Z [1669167616.406574] [d8f8c46cdf70:17461:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9512062Z [1669167616.406574] [d8f8c46cdf70:17461:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9512672Z STAGE:2022-11-23 01:40:16 17462:17462 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:40:16 17461:17461 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9512695Z 2022-11-23T01:43:17.9513059Z STAGE:2022-11-23 01:40:16 17462:17462 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9513412Z STAGE:2022-11-23 01:40:16 17461:17461 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9513743Z STAGE:2022-11-23 01:40:16 17462:17462 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9514072Z STAGE:2022-11-23 01:40:16 17461:17461 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9514416Z STAGE:2022-11-23 01:40:16 17462:17462 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9514749Z STAGE:2022-11-23 01:40:16 17461:17461 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9515259Z STAGE:2022-11-23 01:40:16 17462:17462 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9515602Z STAGE:2022-11-23 01:40:16 17461:17461 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9515708Z ok (5.869s) 2022-11-23T01:43:17.9515729Z 2022-11-23T01:43:17.9515995Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9516109Z Ran 1 test in 5.869s 2022-11-23T01:43:17.9516129Z 2022-11-23T01:43:17.9516222Z OK 2022-11-23T01:43:17.9516241Z 2022-11-23T01:43:17.9516367Z Generating XML reports... 
2022-11-23T01:43:17.9516822Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014011.xml 2022-11-23T01:43:17.9517206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9517387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9517761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9517961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9517981Z 2022-11-23T01:43:17.9518091Z Running tests... 2022-11-23T01:43:17.9518361Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9518676Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9518959Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports reduce multigpu (0.002s) 2022-11-23T01:43:17.9519058Z 2022-11-23T01:43:17.9519329Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9519442Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9519462Z 2022-11-23T01:43:17.9519553Z OK (skipped=1) 2022-11-23T01:43:17.9519589Z 2022-11-23T01:43:17.9519695Z Generating XML reports... 
2022-11-23T01:43:17.9520151Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014019.xml 2022-11-23T01:43:17.9520528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9520707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9521091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9521285Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9521310Z 2022-11-23T01:43:17.9521456Z Running tests... 2022-11-23T01:43:17.9521734Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9522030Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9522290Z test_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9522574Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17608 2022-11-23T01:43:17.9522808Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17609 2022-11-23T01:43:17.9523188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9523369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9523754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9523956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9524314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9524490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9524878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9525071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9525318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9525566Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9525972Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9526377Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9526614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9526939Z STAGE:2022-11-23 01:40:26 17609:17609 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9527174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9527517Z STAGE:2022-11-23 01:40:26 17608:17608 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9527802Z [1669167626.161632] [d8f8c46cdf70:17609:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9528039Z [1669167627.189073] [d8f8c46cdf70:17609:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9528285Z [1669167627.189073] [d8f8c46cdf70:17609:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9528635Z [1669167626.140205] [d8f8c46cdf70:17608:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9528873Z [1669167627.185364] [d8f8c46cdf70:17608:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9529120Z [1669167627.185364] [d8f8c46cdf70:17608:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9529686Z STAGE:2022-11-23 01:40:27 17609:17609 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:40:27 17608:17608 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9529707Z 2022-11-23T01:43:17.9530066Z STAGE:2022-11-23 01:40:27 17608:17608 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9530405Z STAGE:2022-11-23 01:40:27 17609:17609 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9530742Z STAGE:2022-11-23 01:40:27 17609:17609 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9531072Z STAGE:2022-11-23 01:40:27 17608:17608 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9531455Z STAGE:2022-11-23 01:40:27 17609:17609 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9531800Z STAGE:2022-11-23 01:40:27 17608:17608 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9532149Z STAGE:2022-11-23 01:40:27 17609:17609 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9532496Z STAGE:2022-11-23 01:40:27 17608:17608 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9532600Z ok (5.868s) 2022-11-23T01:43:17.9532620Z 2022-11-23T01:43:17.9532887Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9532989Z Ran 1 test in 5.869s 2022-11-23T01:43:17.9533008Z 2022-11-23T01:43:17.9533102Z OK 2022-11-23T01:43:17.9533121Z 2022-11-23T01:43:17.9533250Z Generating XML reports... 
2022-11-23T01:43:17.9533704Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014022.xml 2022-11-23T01:43:17.9534087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9534273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9534662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9534858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9534878Z 2022-11-23T01:43:17.9534971Z Running tests... 2022-11-23T01:43:17.9535236Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9535555Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9535856Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce_scatter_tensor (0.002s) 2022-11-23T01:43:17.9535876Z 2022-11-23T01:43:17.9536144Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9536259Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9536279Z 2022-11-23T01:43:17.9536389Z OK (skipped=1) 2022-11-23T01:43:17.9536408Z 2022-11-23T01:43:17.9536532Z Generating XML reports... 
2022-11-23T01:43:17.9536988Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014030.xml 2022-11-23T01:43:17.9537349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9537531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9537994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9538189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9538208Z 2022-11-23T01:43:17.9538318Z Running tests... 2022-11-23T01:43:17.9538587Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9538904Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9539181Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports reduce_scatter_v (0.003s) 2022-11-23T01:43:17.9539201Z 2022-11-23T01:43:17.9539465Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9539562Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9539582Z 2022-11-23T01:43:17.9539691Z OK (skipped=1) 2022-11-23T01:43:17.9539711Z 2022-11-23T01:43:17.9539841Z Generating XML reports... 
2022-11-23T01:43:17.9540292Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014032.xml 2022-11-23T01:43:17.9540670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9540848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9541289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9541496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9541516Z 2022-11-23T01:43:17.9541625Z Running tests... 2022-11-23T01:43:17.9541876Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9542192Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9542442Z test_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9542669Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17788 2022-11-23T01:43:17.9542890Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17789 2022-11-23T01:43:17.9543269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9543447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9543837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9544017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9544392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9544573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9544961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9545154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9545403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9545657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9546065Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9546468Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9546685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9547029Z STAGE:2022-11-23 01:40:39 17789:17789 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9547325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9547665Z STAGE:2022-11-23 01:40:39 17788:17788 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9547952Z [1669167639.330965] [d8f8c46cdf70:17789:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9548194Z [1669167640.358239] [d8f8c46cdf70:17789:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9548439Z [1669167640.358239] [d8f8c46cdf70:17789:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9548718Z [1669167639.309071] [d8f8c46cdf70:17788:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9548954Z [1669167640.361210] [d8f8c46cdf70:17788:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9549187Z [1669167640.361210] [d8f8c46cdf70:17788:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9549803Z STAGE:2022-11-23 01:40:40 17789:17789 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:40:40 17788:17788 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9549847Z 2022-11-23T01:43:17.9550197Z STAGE:2022-11-23 01:40:40 17789:17789 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9550548Z STAGE:2022-11-23 01:40:40 17788:17788 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9550881Z STAGE:2022-11-23 01:40:40 17788:17788 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9551207Z STAGE:2022-11-23 01:40:40 17789:17789 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9551551Z STAGE:2022-11-23 01:40:40 17788:17788 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9551901Z STAGE:2022-11-23 01:40:40 17788:17788 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9552246Z STAGE:2022-11-23 01:40:40 17789:17789 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9552593Z STAGE:2022-11-23 01:40:40 17789:17789 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9552680Z ok (5.841s) 2022-11-23T01:43:17.9552717Z 2022-11-23T01:43:17.9552968Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9553081Z Ran 1 test in 5.841s 2022-11-23T01:43:17.9553101Z 2022-11-23T01:43:17.9553194Z OK 2022-11-23T01:43:17.9553213Z 2022-11-23T01:43:17.9553337Z Generating XML reports... 
2022-11-23T01:43:17.9553790Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014035.xml 2022-11-23T01:43:17.9554171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9554349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9554740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9554919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9554938Z 2022-11-23T01:43:17.9555209Z Running tests... 2022-11-23T01:43:17.9555490Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9555808Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9556071Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T01:43:17.9556171Z 2022-11-23T01:43:17.9556442Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9556558Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9556578Z 2022-11-23T01:43:17.9556688Z OK (skipped=1) 2022-11-23T01:43:17.9556707Z 2022-11-23T01:43:17.9556814Z Generating XML reports... 
2022-11-23T01:43:17.9557272Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014043.xml 2022-11-23T01:43:17.9557652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9557832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9558252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9558449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9558469Z 2022-11-23T01:43:17.9558577Z Running tests... 2022-11-23T01:43:17.9558849Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9559322Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9559581Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T01:43:17.9559620Z 2022-11-23T01:43:17.9559944Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9560070Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9560090Z 2022-11-23T01:43:17.9560200Z OK (skipped=1) 2022-11-23T01:43:17.9560218Z 2022-11-23T01:43:17.9560342Z Generating XML reports... 
2022-11-23T01:43:17.9560798Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014046.xml 2022-11-23T01:43:17.9561177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9561364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9561756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9561936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9561956Z 2022-11-23T01:43:17.9562065Z Running tests... 2022-11-23T01:43:17.9562334Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9562656Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9562915Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9563231Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17968 2022-11-23T01:43:17.9563455Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17969 2022-11-23T01:43:17.9563838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9564005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9564390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9564589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9564961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9565136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9565515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9565708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9565955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9566269Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9566667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9567073Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9567306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9567649Z STAGE:2022-11-23 01:40:52 17969:17969 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9567879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9568220Z STAGE:2022-11-23 01:40:52 17968:17968 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9568503Z [1669167652.627898] [d8f8c46cdf70:17969:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9568749Z [1669167653.652782] [d8f8c46cdf70:17969:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9569041Z [1669167653.652782] [d8f8c46cdf70:17969:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9569383Z STAGE:2022-11-23 01:40:54 17969:17969 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9569662Z [1669167652.606488] [d8f8c46cdf70:17968:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9569899Z [1669167653.658348] [d8f8c46cdf70:17968:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9570142Z [1669167653.658348] [d8f8c46cdf70:17968:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9570493Z STAGE:2022-11-23 01:40:54 17968:17968 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9570852Z STAGE:2022-11-23 01:40:54 17969:17969 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9571208Z STAGE:2022-11-23 01:40:54 17968:17968 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9571543Z STAGE:2022-11-23 01:40:54 17968:17968 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9571872Z STAGE:2022-11-23 01:40:54 17969:17969 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9572208Z STAGE:2022-11-23 01:40:54 17968:17968 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9572540Z STAGE:2022-11-23 01:40:54 17968:17968 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9572880Z STAGE:2022-11-23 01:40:54 17969:17969 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9573233Z STAGE:2022-11-23 01:40:54 17969:17969 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9573339Z ok (5.983s) 2022-11-23T01:43:17.9573359Z 2022-11-23T01:43:17.9573625Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9573742Z Ran 1 test in 5.983s 2022-11-23T01:43:17.9573762Z 2022-11-23T01:43:17.9573857Z OK 2022-11-23T01:43:17.9573876Z 2022-11-23T01:43:17.9574003Z Generating XML reports... 
2022-11-23T01:43:17.9574438Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014048.xml 2022-11-23T01:43:17.9574817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9574997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9575445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9575643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9575662Z 2022-11-23T01:43:17.9575771Z Running tests... 2022-11-23T01:43:17.9576038Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9576358Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9576627Z test_scatter (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9576647Z 2022-11-23T01:43:17.9576891Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9577004Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9577024Z 2022-11-23T01:43:17.9577132Z OK (skipped=1) 2022-11-23T01:43:17.9577151Z 2022-11-23T01:43:17.9577275Z Generating XML reports... 
2022-11-23T01:43:17.9577726Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014057.xml 2022-11-23T01:43:17.9578100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9578279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9578712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9578921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9578941Z 2022-11-23T01:43:17.9579033Z Running tests... 2022-11-23T01:43:17.9579297Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9579610Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9579879Z test_scatter_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9579904Z 2022-11-23T01:43:17.9580169Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9580283Z Ran 1 test in 0.003s 2022-11-23T01:43:17.9580303Z 2022-11-23T01:43:17.9580412Z OK (skipped=1) 2022-11-23T01:43:17.9580431Z 2022-11-23T01:43:17.9580555Z Generating XML reports... 
2022-11-23T01:43:17.9580990Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014059.xml 2022-11-23T01:43:17.9581368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9581547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9581932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9582126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9582149Z 2022-11-23T01:43:17.9582261Z Running tests... 2022-11-23T01:43:17.9582526Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9582843Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9583118Z test_scatter_complex (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9583142Z 2022-11-23T01:43:17.9583391Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9583507Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9583528Z 2022-11-23T01:43:17.9583636Z OK (skipped=1) 2022-11-23T01:43:17.9583655Z 2022-11-23T01:43:17.9583779Z Generating XML reports... 
2022-11-23T01:43:17.9584227Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014101.xml 2022-11-23T01:43:17.9584598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9584894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9585281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9585476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9585496Z 2022-11-23T01:43:17.9585593Z Running tests... 2022-11-23T01:43:17.9585856Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9586171Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9586427Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T01:43:17.9586447Z 2022-11-23T01:43:17.9586709Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9586820Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9586844Z 2022-11-23T01:43:17.9586953Z OK (skipped=1) 2022-11-23T01:43:17.9586972Z 2022-11-23T01:43:17.9587095Z Generating XML reports... 
2022-11-23T01:43:17.9587546Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014104.xml 2022-11-23T01:43:17.9587954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9588144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9588532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9588726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9588745Z 2022-11-23T01:43:17.9588854Z Running tests... 2022-11-23T01:43:17.9589119Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9589432Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9589708Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T01:43:17.9589728Z 2022-11-23T01:43:17.9589977Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9590091Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9590110Z 2022-11-23T01:43:17.9590223Z OK (skipped=1) 2022-11-23T01:43:17.9590242Z 2022-11-23T01:43:17.9590366Z Generating XML reports... 
2022-11-23T01:43:17.9590811Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014106.xml 2022-11-23T01:43:17.9591187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9591367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9591749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9591948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9591968Z 2022-11-23T01:43:17.9592060Z Running tests... 2022-11-23T01:43:17.9592324Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9592643Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9592919Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9592939Z 2022-11-23T01:43:17.9593204Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9593317Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9593337Z 2022-11-23T01:43:17.9593445Z OK (skipped=1) 2022-11-23T01:43:17.9593464Z 2022-11-23T01:43:17.9593588Z Generating XML reports... 
2022-11-23T01:43:17.9594034Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014109.xml 2022-11-23T01:43:17.9594469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9594648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9595197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9595403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9595425Z 2022-11-23T01:43:17.9595534Z Running tests... 2022-11-23T01:43:17.9595806Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9596117Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9596387Z test_scatter_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:43:17.9596411Z 2022-11-23T01:43:17.9596675Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9596774Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9596794Z 2022-11-23T01:43:17.9596903Z OK (skipped=1) 2022-11-23T01:43:17.9596922Z 2022-11-23T01:43:17.9597046Z Generating XML reports... 
2022-11-23T01:43:17.9597570Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014111.xml 2022-11-23T01:43:17.9597967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9598143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9598525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9598719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9598739Z 2022-11-23T01:43:17.9598832Z Running tests... 2022-11-23T01:43:17.9599102Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9599418Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9599808Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:43:17.9599828Z 2022-11-23T01:43:17.9600093Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9600206Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9600226Z 2022-11-23T01:43:17.9600333Z OK (skipped=1) 2022-11-23T01:43:17.9600352Z 2022-11-23T01:43:17.9600477Z Generating XML reports... 
2022-11-23T01:43:17.9600923Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014113.xml 2022-11-23T01:43:17.9601285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9601469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9601852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9602046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9602066Z 2022-11-23T01:43:17.9602177Z Running tests... 2022-11-23T01:43:17.9602444Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9602759Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9603008Z test_send_recv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9603216Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18346 2022-11-23T01:43:17.9603437Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18347 2022-11-23T01:43:17.9603815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9604070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9604460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9604658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9605034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9605213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9605600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9605778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9606027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9606279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9606686Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9607131Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9607374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9607603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9607885Z [1669167680.230950] [d8f8c46cdf70:18347:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9608126Z [1669167680.991540] [d8f8c46cdf70:18347:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9608359Z [1669167680.991540] [d8f8c46cdf70:18347:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9608639Z [1669167680.210447] [d8f8c46cdf70:18346:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9608878Z [1669167680.995546] [d8f8c46cdf70:18346:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9609121Z [1669167680.995546] [d8f8c46cdf70:18346:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9609226Z ok (5.462s) 2022-11-23T01:43:17.9609247Z 2022-11-23T01:43:17.9609519Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9609633Z Ran 1 test in 5.462s 2022-11-23T01:43:17.9609653Z 2022-11-23T01:43:17.9609745Z OK 2022-11-23T01:43:17.9609764Z 2022-11-23T01:43:17.9609895Z Generating XML reports... 
2022-11-23T01:43:17.9610329Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014116.xml 2022-11-23T01:43:17.9610707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9610889Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9611273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9611469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9611489Z 2022-11-23T01:43:17.9611598Z Running tests... 2022-11-23T01:43:17.9611864Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9612178Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9612467Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T01:43:17.9612540Z 2022-11-23T01:43:17.9612800Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9612917Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9612936Z 2022-11-23T01:43:17.9613046Z OK (skipped=1) 2022-11-23T01:43:17.9613066Z 2022-11-23T01:43:17.9613192Z Generating XML reports... 
2022-11-23T01:43:17.9613646Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014124.xml 2022-11-23T01:43:17.9614023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9614204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9614586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9614784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9614805Z 2022-11-23T01:43:17.9614897Z Running tests... 2022-11-23T01:43:17.9615160Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9615474Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9615831Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T01:43:17.9615853Z 2022-11-23T01:43:17.9616126Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9616242Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9616262Z 2022-11-23T01:43:17.9616371Z OK (skipped=1) 2022-11-23T01:43:17.9616390Z 2022-11-23T01:43:17.9616514Z Generating XML reports... 
2022-11-23T01:43:17.9616967Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014126.xml 2022-11-23T01:43:17.9617334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9617514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9617899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9618096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9618115Z 2022-11-23T01:43:17.9618225Z Running tests... 2022-11-23T01:43:17.9618487Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9618799Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9619104Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T01:43:17.9619125Z 2022-11-23T01:43:17.9619396Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9619493Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9619513Z 2022-11-23T01:43:17.9619621Z OK (skipped=1) 2022-11-23T01:43:17.9619640Z 2022-11-23T01:43:17.9619763Z Generating XML reports... 
2022-11-23T01:43:17.9620217Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014129.xml 2022-11-23T01:43:17.9620595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9620773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9621157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9621350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9621406Z 2022-11-23T01:43:17.9621501Z Running tests... 2022-11-23T01:43:17.9621833Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9622151Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9622430Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9622655Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18555 2022-11-23T01:43:17.9622879Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18556 2022-11-23T01:43:17.9623254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9623434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9623820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9623998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9624378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9624557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9624943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9625182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9625439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9625687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9626100Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9626485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9626725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9627071Z STAGE:2022-11-23 01:41:35 18555:18555 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9627303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9627644Z STAGE:2022-11-23 01:41:35 18556:18556 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9627928Z [1669167695.486158] [d8f8c46cdf70:18556:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9628169Z [1669167696.519954] [d8f8c46cdf70:18556:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9628415Z [1669167696.519954] [d8f8c46cdf70:18556:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9628768Z STAGE:2022-11-23 01:41:36 18556:18556 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9629128Z STAGE:2022-11-23 01:41:36 18556:18556 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9629396Z [1669167695.464169] [d8f8c46cdf70:18555:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9629634Z [1669167696.507848] [d8f8c46cdf70:18555:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9629878Z [1669167696.507848] [d8f8c46cdf70:18555:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9630221Z STAGE:2022-11-23 01:41:36 18555:18555 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9630577Z STAGE:2022-11-23 01:41:36 18555:18555 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9630747Z ok (5.964s) 2022-11-23T01:43:17.9630768Z 2022-11-23T01:43:17.9631041Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9631157Z Ran 1 test in 5.964s 2022-11-23T01:43:17.9631177Z 2022-11-23T01:43:17.9631272Z OK 2022-11-23T01:43:17.9631291Z 2022-11-23T01:43:17.9631400Z Generating XML reports... 2022-11-23T01:43:17.9631859Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014131.xml 2022-11-23T01:43:17.9632239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9632418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9632806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9633002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9633025Z 2022-11-23T01:43:17.9633136Z Running tests... 2022-11-23T01:43:17.9633404Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9633704Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9633996Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) ... 
skip: NCCL Send Recv Only (0.002s) 2022-11-23T01:43:17.9634019Z 2022-11-23T01:43:17.9634298Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9634413Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9634432Z 2022-11-23T01:43:17.9634542Z OK (skipped=1) 2022-11-23T01:43:17.9634561Z 2022-11-23T01:43:17.9634687Z Generating XML reports... 2022-11-23T01:43:17.9635355Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014140.xml 2022-11-23T01:43:17.9635746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9635935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9636305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9636498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9636524Z 2022-11-23T01:43:17.9636636Z Running tests... 2022-11-23T01:43:17.9636903Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9637222Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9637492Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T01:43:17.9637513Z 2022-11-23T01:43:17.9637781Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9637901Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9637921Z 2022-11-23T01:43:17.9638029Z OK (skipped=1) 2022-11-23T01:43:17.9638049Z 2022-11-23T01:43:17.9638157Z Generating XML reports... 
2022-11-23T01:43:17.9638611Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014142.xml 2022-11-23T01:43:17.9638992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9639174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9639557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9639749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9639769Z 2022-11-23T01:43:17.9639878Z Running tests... 2022-11-23T01:43:17.9640143Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9640547Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9640795Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T01:43:17.9640815Z 2022-11-23T01:43:17.9641078Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9641191Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9641215Z 2022-11-23T01:43:17.9641327Z OK (skipped=1) 2022-11-23T01:43:17.9641347Z 2022-11-23T01:43:17.9641470Z Generating XML reports... 
2022-11-23T01:43:17.9641921Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014144.xml 2022-11-23T01:43:17.9642297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9642476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9642845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9643045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9643064Z 2022-11-23T01:43:17.9643174Z Running tests... 2022-11-23T01:43:17.9643439Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9643811Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9644096Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9644318Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18768 2022-11-23T01:43:17.9644537Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18769 2022-11-23T01:43:17.9644915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9645082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9645466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9645661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9646041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9646222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9646606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9646798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9647048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9647282Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9647693Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9648096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9648334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9648677Z STAGE:2022-11-23 01:41:51 18769:18769 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9648908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9649240Z STAGE:2022-11-23 01:41:51 18768:18768 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9649526Z [1669167711.189964] [d8f8c46cdf70:18769:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9649826Z [1669167712.244044] [d8f8c46cdf70:18769:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9650072Z [1669167712.244044] [d8f8c46cdf70:18769:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9650411Z STAGE:2022-11-23 01:41:52 18769:18769 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9650772Z STAGE:2022-11-23 01:41:52 18769:18769 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9651055Z [1669167711.187399] [d8f8c46cdf70:18768:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9651293Z [1669167712.236697] [d8f8c46cdf70:18768:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9651535Z [1669167712.236697] [d8f8c46cdf70:18768:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9651885Z STAGE:2022-11-23 01:41:52 18768:18768 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9652241Z STAGE:2022-11-23 01:41:52 18768:18768 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9652347Z ok (5.970s) 2022-11-23T01:43:17.9652368Z 2022-11-23T01:43:17.9652682Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9652788Z Ran 1 test in 5.970s 2022-11-23T01:43:17.9652808Z 2022-11-23T01:43:17.9652903Z OK 2022-11-23T01:43:17.9652923Z 2022-11-23T01:43:17.9653046Z Generating XML reports... 2022-11-23T01:43:17.9653500Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014147.xml 2022-11-23T01:43:17.9653876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9654063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9654452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9654649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9654669Z 2022-11-23T01:43:17.9654778Z Running tests... 2022-11-23T01:43:17.9655031Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9655344Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9655604Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9655827Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18882 2022-11-23T01:43:17.9656047Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18883 2022-11-23T01:43:17.9656424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9656608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9656996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9657177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9657550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9657728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9658106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9658300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9658549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9658867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9659277Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9659687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9659905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9660134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9660416Z [1669167719.743849] [d8f8c46cdf70:18883:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9660655Z [1669167720.559488] [d8f8c46cdf70:18883:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9660905Z [1669167720.559488] [d8f8c46cdf70:18883:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9661182Z [1669167719.741292] [d8f8c46cdf70:18882:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9661464Z [1669167720.542722] [d8f8c46cdf70:18882:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9661719Z [1669167720.542722] [d8f8c46cdf70:18882:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9661825Z ok (5.563s) 2022-11-23T01:43:17.9661845Z 2022-11-23T01:43:17.9662115Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9662215Z Ran 1 test in 5.564s 2022-11-23T01:43:17.9662235Z 2022-11-23T01:43:17.9662332Z OK 2022-11-23T01:43:17.9662351Z 2022-11-23T01:43:17.9662482Z Generating XML reports... 
2022-11-23T01:43:17.9662938Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014155.xml 2022-11-23T01:43:17.9663322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9663503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9663889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9664084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9664105Z 2022-11-23T01:43:17.9664197Z Running tests... 2022-11-23T01:43:17.9664467Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9664781Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9665071Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9665297Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18992 2022-11-23T01:43:17.9665518Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18993 2022-11-23T01:43:17.9665898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9666078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9666466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9666645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9667018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9667195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9667638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9667831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9668077Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9668328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9668732Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9669117Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9669350Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9669694Z STAGE:2022-11-23 01:42:07 18992:18992 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9669929Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9670273Z STAGE:2022-11-23 01:42:07 18993:18993 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9670599Z [1669167727.908022] [d8f8c46cdf70:18992:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9670847Z [1669167728.942285] [d8f8c46cdf70:18992:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9671094Z [1669167728.942285] [d8f8c46cdf70:18992:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9671441Z STAGE:2022-11-23 01:42:09 18992:18992 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9671722Z [1669167727.929816] [d8f8c46cdf70:18993:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9671946Z [1669167728.959633] [d8f8c46cdf70:18993:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9672192Z [1669167728.959633] [d8f8c46cdf70:18993:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9672541Z STAGE:2022-11-23 01:42:09 18993:18993 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9672896Z STAGE:2022-11-23 01:42:09 18992:18992 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9673248Z STAGE:2022-11-23 01:42:09 18993:18993 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9673354Z ok (5.968s) 2022-11-23T01:43:17.9673374Z 2022-11-23T01:43:17.9673641Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9673762Z Ran 1 test in 5.968s 2022-11-23T01:43:17.9673782Z 2022-11-23T01:43:17.9673876Z OK 2022-11-23T01:43:17.9673895Z 2022-11-23T01:43:17.9674004Z Generating XML reports... 2022-11-23T01:43:17.9674454Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014203.xml 2022-11-23T01:43:17.9674839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9675222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9675631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9675829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9675851Z 2022-11-23T01:43:17.9675960Z Running tests... 
2022-11-23T01:43:17.9676230Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9676614Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9676901Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9677125Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19106 2022-11-23T01:43:17.9677352Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19107 2022-11-23T01:43:17.9677727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9677907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9678292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9678487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9678860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9679026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9679410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9679604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9679910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9680169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9680576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9680977Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9681215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9681558Z STAGE:2022-11-23 01:42:16 19107:19107 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9681773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9682116Z STAGE:2022-11-23 01:42:16 19106:19106 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:43:17.9682399Z [1669167736.347077] [d8f8c46cdf70:19106:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9682640Z [1669167737.410429] [d8f8c46cdf70:19106:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9682890Z [1669167737.410429] [d8f8c46cdf70:19106:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9683236Z STAGE:2022-11-23 01:42:17 19106:19106 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9683521Z [1669167736.369853] [d8f8c46cdf70:19107:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9683758Z [1669167737.416681] [d8f8c46cdf70:19107:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9684006Z [1669167737.416681] [d8f8c46cdf70:19107:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9684355Z STAGE:2022-11-23 01:42:17 19107:19107 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:43:17.9684694Z STAGE:2022-11-23 01:42:17 19106:19106 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9685048Z STAGE:2022-11-23 01:42:17 19107:19107 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:43:17.9685212Z ok (6.017s) 2022-11-23T01:43:17.9685233Z 2022-11-23T01:43:17.9685505Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9685621Z Ran 1 test in 6.017s 2022-11-23T01:43:17.9685641Z 2022-11-23T01:43:17.9685736Z OK 2022-11-23T01:43:17.9685756Z 2022-11-23T01:43:17.9685881Z Generating XML reports... 2022-11-23T01:43:17.9686336Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014212.xml 2022-11-23T01:43:17.9686700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9686882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9687268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9687460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9687480Z 2022-11-23T01:43:17.9687593Z Running tests... 
2022-11-23T01:43:17.9687865Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9688182Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9688468Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T01:43:17.9688540Z 2022-11-23T01:43:17.9688815Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9688912Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9688932Z 2022-11-23T01:43:17.9689042Z OK (skipped=1) 2022-11-23T01:43:17.9689061Z 2022-11-23T01:43:17.9689186Z Generating XML reports... 2022-11-23T01:43:17.9689636Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014221.xml 2022-11-23T01:43:17.9690011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9690194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9690583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9690779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9690799Z 2022-11-23T01:43:17.9690913Z Running tests... 2022-11-23T01:43:17.9691164Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9691479Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9691771Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T01:43:17.9691792Z 2022-11-23T01:43:17.9692054Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9692165Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9692188Z 2022-11-23T01:43:17.9692296Z OK (skipped=1) 2022-11-23T01:43:17.9692316Z 2022-11-23T01:43:17.9692444Z Generating XML reports... 2022-11-23T01:43:17.9692894Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014223.xml 2022-11-23T01:43:17.9693276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9693438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9693821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9694014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9694034Z 2022-11-23T01:43:17.9694142Z Running tests... 2022-11-23T01:43:17.9694410Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9694727Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9695059Z test_stateless_api_with_ddp (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9695285Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19286 2022-11-23T01:43:17.9695489Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19287 2022-11-23T01:43:17.9695870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9696050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9696434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9696628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9697007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9697191Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9697577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9697771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9698052Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9698313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9698724Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9699124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9699356Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9699592Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9699876Z [1669167750.537198] [d8f8c46cdf70:19286:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9700118Z [1669167750.542827] [d8f8c46cdf70:19286:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9700363Z [1669167750.542827] [d8f8c46cdf70:19286:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9700626Z [1669167750.537275] [d8f8c46cdf70:19287:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9700860Z [1669167750.542677] [d8f8c46cdf70:19287:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9701106Z [1669167750.542677] [d8f8c46cdf70:19287:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9701210Z ok (6.032s) 2022-11-23T01:43:17.9701230Z 2022-11-23T01:43:17.9701505Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9701620Z Ran 1 test in 6.033s 2022-11-23T01:43:17.9701640Z 2022-11-23T01:43:17.9701732Z OK 2022-11-23T01:43:17.9701756Z 2022-11-23T01:43:17.9701882Z Generating XML reports... 
2022-11-23T01:43:17.9702336Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014225.xml 2022-11-23T01:43:17.9702698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9702879Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9703262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9703520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9703541Z 2022-11-23T01:43:17.9703653Z Running tests... 2022-11-23T01:43:17.9703925Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9704240Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9704514Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9704724Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19404 2022-11-23T01:43:17.9704946Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19405 2022-11-23T01:43:17.9705323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9705500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9705886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9706081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9706455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9706683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9707080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9707257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9707505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9707752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9708157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9708567Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9708800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9709065Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5afsd72v 2022-11-23T01:43:17.9709337Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5afsd72v/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9709568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9709808Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9efh0z67 2022-11-23T01:43:17.9710078Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9efh0z67/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9710363Z [1669167758.398290] [d8f8c46cdf70:19404:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9710604Z [1669167759.176906] [d8f8c46cdf70:19404:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9710853Z [1669167759.176906] [d8f8c46cdf70:19404:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9711133Z [1669167758.419164] [d8f8c46cdf70:19405:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9711367Z [1669167759.184291] [d8f8c46cdf70:19405:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9711609Z [1669167759.184291] [d8f8c46cdf70:19405:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9711772Z ok (5.458s) 2022-11-23T01:43:17.9711792Z 2022-11-23T01:43:17.9712067Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9712166Z Ran 1 test in 5.458s 2022-11-23T01:43:17.9712185Z 2022-11-23T01:43:17.9712278Z OK 2022-11-23T01:43:17.9712297Z 2022-11-23T01:43:17.9712421Z Generating XML reports... 2022-11-23T01:43:17.9712878Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014234.xml 2022-11-23T01:43:17.9713260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9713439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9713827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9714023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9714046Z 2022-11-23T01:43:17.9714139Z Running tests... 2022-11-23T01:43:17.9714408Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9714720Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9715233Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl & Gloo backend support DistributedDataParallel (0.002s) 2022-11-23T01:43:17.9715325Z 2022-11-23T01:43:17.9715620Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9715738Z Ran 1 test in 0.002s 2022-11-23T01:43:17.9715758Z 2022-11-23T01:43:17.9715869Z OK (skipped=1) 2022-11-23T01:43:17.9715888Z 2022-11-23T01:43:17.9716016Z Generating XML reports... 2022-11-23T01:43:17.9716470Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014242.xml 2022-11-23T01:43:17.9716830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9717015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9717402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9717595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9717615Z 2022-11-23T01:43:17.9717728Z Running tests... 2022-11-23T01:43:17.9717995Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9718309Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9718603Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9718809Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19551 2022-11-23T01:43:17.9719029Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19552 2022-11-23T01:43:17.9719410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9719589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9719972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9720170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9720543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9720721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9721107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9721286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9721670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9721920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9722333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9722740Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9722975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9723206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9723464Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl7yz526x 2022-11-23T01:43:17.9723738Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl7yz526x/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9723984Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpccqecldb 2022-11-23T01:43:17.9724255Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpccqecldb/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9724639Z [1669167769.475684] [d8f8c46cdf70:19551:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9724893Z [1669167769.480987] [d8f8c46cdf70:19551:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9725140Z [1669167769.480987] [d8f8c46cdf70:19551:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9725919Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:43:17.9726212Z [1669167769.478637] [d8f8c46cdf70:19552:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9726448Z [1669167769.484091] [d8f8c46cdf70:19552:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9726692Z [1669167769.484091] [d8f8c46cdf70:19552:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9727467Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:43:17.9727578Z ok (5.927s) 2022-11-23T01:43:17.9727599Z 2022-11-23T01:43:17.9727880Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9727998Z Ran 1 test in 5.927s 2022-11-23T01:43:17.9728018Z 2022-11-23T01:43:17.9728093Z OK 2022-11-23T01:43:17.9728112Z 2022-11-23T01:43:17.9728238Z Generating XML reports... 2022-11-23T01:43:17.9728692Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014244.xml 2022-11-23T01:43:17.9729073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9729310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9729700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9729896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9729916Z 2022-11-23T01:43:17.9730026Z Running tests... 
2022-11-23T01:43:17.9730293Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9730593Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9730877Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9731099Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19669 2022-11-23T01:43:17.9731319Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19670 2022-11-23T01:43:17.9731704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9731883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9732265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9732506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9732876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9733055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9733439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9733632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9733877Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9734130Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9734538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9734948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9735185Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9735401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9735646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9735887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9736292Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9736695Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9736939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:43:17.9737182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:43:17.9737577Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.9737974Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:43:17.9738219Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6vlqek6m 2022-11-23T01:43:17.9738498Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6vlqek6m/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9738817Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt21i5h_3 2022-11-23T01:43:17.9739090Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt21i5h_3/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9739374Z [1669167778.042702] [d8f8c46cdf70:19669:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9739613Z [1669167778.049138] [d8f8c46cdf70:19669:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9739862Z [1669167778.049138] [d8f8c46cdf70:19669:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9740182Z [1669167783.436157] [d8f8c46cdf70:19669:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. has expired on req 0x5621839ee000, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T01:43:17.9740515Z [1669167783.472808] [d8f8c46cdf70:19669:0] mpool.c:55 UCX WARN object 0x5621839ff540 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T01:43:17.9740843Z [1669167778.050885] [d8f8c46cdf70:19670:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:43:17.9741090Z [1669167778.057116] [d8f8c46cdf70:19670:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9741315Z [1669167778.057116] [d8f8c46cdf70:19670:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9741723Z [1669167783.482807] [d8f8c46cdf70:19670:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x558faede72c0 was not matched 2022-11-23T01:43:17.9741828Z ok (10.638s) 2022-11-23T01:43:17.9741849Z 2022-11-23T01:43:17.9742124Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9742241Z Ran 1 test in 10.638s 2022-11-23T01:43:17.9742261Z 2022-11-23T01:43:17.9742354Z OK 2022-11-23T01:43:17.9742373Z 2022-11-23T01:43:17.9742498Z Generating XML reports... 2022-11-23T01:43:17.9742957Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014253.xml 2022-11-23T01:43:17.9743338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9743501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9743887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9744082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9744103Z 2022-11-23T01:43:17.9744212Z Running tests... 2022-11-23T01:43:17.9744486Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9744801Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:43:17.9745094Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:43:17.9745323Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19789 2022-11-23T01:43:17.9745527Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19790 2022-11-23T01:43:17.9745905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9746083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9746470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9746663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9747095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:43:17.9747274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:43:17.9747656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:43:17.9747855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:43:17.9748087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:43:17.9748335Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:43:17.9748743Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:43:17.9749147Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:43:17.9749382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:43:17.9749613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:43:17.9749903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:43:17.9750159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:43:17.9750556Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9750941Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:43:17.9751188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:43:17.9751436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:43:17.9751833Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:43:17.9752231Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:43:17.9752496Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp22xl8kkn 2022-11-23T01:43:17.9752769Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp22xl8kkn/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9753024Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7_l9g1mz 2022-11-23T01:43:17.9753291Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7_l9g1mz/_remote_module_non_scriptable.py 2022-11-23T01:43:17.9753558Z [1669167791.265263] [d8f8c46cdf70:19790:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9753805Z [1669167791.270396] [d8f8c46cdf70:19790:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9754051Z [1669167791.270396] [d8f8c46cdf70:19790:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9754459Z [1669167796.651262] [d8f8c46cdf70:19790:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x55e8c008fc00 was not matched 2022-11-23T01:43:17.9754740Z [1669167791.263187] [d8f8c46cdf70:19789:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:43:17.9754975Z [1669167791.269050] [d8f8c46cdf70:19789:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:43:17.9755430Z [1669167791.269050] [d8f8c46cdf70:19789:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:43:17.9755844Z [1669167796.614445] [d8f8c46cdf70:19789:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. 
has expired on req 0x55e26768bb80, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T01:43:17.9756134Z [1669167796.661329] [d8f8c46cdf70:19789:0] mpool.c:55 UCX WARN object 0x55e26779d0c0 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T01:43:17.9756241Z ok (10.533s) 2022-11-23T01:43:17.9756262Z 2022-11-23T01:43:17.9756542Z ---------------------------------------------------------------------- 2022-11-23T01:43:17.9756642Z Ran 1 test in 10.533s 2022-11-23T01:43:17.9756661Z 2022-11-23T01:43:17.9756755Z OK 2022-11-23T01:43:17.9756774Z 2022-11-23T01:43:17.9756900Z Generating XML reports... 2022-11-23T01:43:17.9757355Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014306.xml 2022-11-23T01:43:17.9757379Z 2022-11-23T01:43:17.9757831Z ##[endgroup] 2022-11-23T01:43:17.9758305Z FINISHED PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_y2_41v57) 2022-11-23T01:43:17.9758326Z 2022-11-23T01:43:17.9758540Z Running distributed tests for the ucc backend with file init_method in shard 3 of 3 2022-11-23T01:43:17.9759111Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 01:43:17.738440] 2022-11-23T02:07:36.8568356Z 2022-11-23T02:07:36.8569336Z Expand the folded group to see the log file of distributed/test_distributed_spawn 2022-11-23T02:07:36.8570514Z ##[group]PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_mqf8_nzl) 2022-11-23T02:07:36.8572502Z 2022-11-23T02:07:36.8628222Z , <__main__.TestDistBackendWithSpawn testMethod=test_3_level_hierarchical_model_averager>, <__main__.TestDistBackendWithSpawn testMethod=test_Backend_enum_class>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_2D_Input>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Channels_Last>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_No_Affine>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_non_default_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_with_amp_and_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedSampler_padding>, <__main__.TestDistBackendWithSpawn 
testMethod=test_SyncBatchNorm_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_with_then_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_simple>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_with_empty>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_cat_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_stack_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_default_pg>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_max>, <__main__.TestDistBackendWithSpawn 
testMethod=test_all_reduce_coalesced_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max_complex_unsupported>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_complex_unsupported_ops>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu_complex>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_result_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_average_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_global>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_group>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo_tags>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_mixed_backend_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_no_rank_zero_nccl>, <__main__.TestDistBackendWithSpawn 
testMethod=test_batch_isend_irecv_op_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_op_list_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_ring_exchange_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_self_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_tensor_err>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_with_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_without_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer_via_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce_return_future>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_comm_hook_logging>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_different_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_same_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_create_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_device>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_forward_backward_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_grad_div_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_post_localSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_pickling_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_ignore_params_arg>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_inference>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_join_model_equivalence>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_gpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_num_params_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_shape_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_err_ignore_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_error>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_namedtuple>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_python_error_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_returns_tensor_with_no_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_shared_grad_acc_unused_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_static_graph_nested_types>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_bn_training_vs_eval>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_module_states>, 
<__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_join_disable>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs_stop_iteration_sync_bn>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_unused_params_rebuild_buckets_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_zero_output_features>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_group>, <__main__.TestDistBackendWithSpawn testMethod=test_detect_ddp_is_actually_static>, <__main__.TestDistBackendWithSpawn testMethod=test_different_graph_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_dump_DDP_relevant_env_vars>, <__main__.TestDistBackendWithSpawn testMethod=test_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_checks>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_get_backend>, <__main__.TestDistBackendWithSpawn testMethod=test_get_future>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_group>, <__main__.TestDistBackendWithSpawn testMethod=test_invalid_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_irecv>, <__main__.TestDistBackendWithSpawn testMethod=test_isend>, <__main__.TestDistBackendWithSpawn testMethod=test_isend_autograd_profiler>, 
<__main__.TestDistBackendWithSpawn testMethod=test_isend_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_failure_order>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_rank_0_timeout>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allgather>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_reduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_high_priority_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_input_rank_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_negative_input_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_group_size_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_overlap_not_allowed>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_world_size_not_divisible_by_group_size>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_dict_module>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_tuple_module>, <__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager>, 
<__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager_param_group>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_step_reload>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_checks>, 
<__main__.TestDistBackendWithSpawn testMethod=test_scatter_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_stateless_api_with_ddp>, <__main__.TestDistBackendWithSpawn testMethod=test_static_graph_api_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_sync_bn_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_undefined_grad_parity_unused_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_verify_model_across_rank_with_logger>, <__main__.TestDistBackendWithSpawn 
testMethod=test_verify_model_across_rank_without_logger>]> 2022-11-23T02:07:36.8672524Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8674673Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8675511Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8675951Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8676518Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8677029Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8677541Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8678064Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8678571Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8679140Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8679726Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8680278Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8680849Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8681390Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8681893Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8682375Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8682856Z test_DistributedSampler_padding 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8683333Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8683795Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8684276Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8684752Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8685242Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8685663Z test_all_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8686054Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8686487Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8686925Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8687379Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8687802Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8688261Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8688749Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8689158Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8689552Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8689963Z test_all_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8690437Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8690876Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8691285Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8691708Z test_all_gather_multigpu_complex 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8692149Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8692564Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8692972Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8693457Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8693888Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8694357Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8694885Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8695353Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8695768Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8696218Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8696665Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8697069Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8697526Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8697984Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8698413Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8698820Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8699260Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8699738Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8700134Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8700566Z 
test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8700990Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8701406Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8701788Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8702208Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8702619Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8702990Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8703376Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8703775Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8704177Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8704589Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8704992Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8705409Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8705780Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8706187Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8706664Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8707057Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8707480Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8707876Z test_all_to_all (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8708265Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8708647Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8709045Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) 
2022-11-23T02:07:36.8709458Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8709850Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8710259Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8710654Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8711050Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8711498Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8711973Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8712437Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8712936Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8713419Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8713882Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8714319Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8714765Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8715597Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8716064Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8716511Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8716980Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8717461Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8717927Z test_all_to_all_single_unequal_split_group 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8718373Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8718818Z test_average_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8719220Z test_backend_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8719590Z test_backend_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8719967Z test_barrier (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8720346Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8720724Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8721124Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8721525Z test_barrier_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8721929Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8722316Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8722739Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8723149Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8723537Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8723954Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8724393Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8724915Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8725321Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8725770Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8726194Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8726623Z test_batch_isend_irecv_ring_exchange_nccl 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8727060Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8727489Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8727868Z test_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8728361Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8728760Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8729161Z test_broadcast_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8729551Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8729953Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8730419Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8730995Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8731476Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8731902Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8732335Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8732761Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8733227Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8733709Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8734147Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8734589Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8735042Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8735469Z 
test_ddp_create_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8735839Z test_ddp_device (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8736236Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8736667Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8737079Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8737524Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8737977Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8738385Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8738818Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8739295Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8739808Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8740373Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8741004Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8741620Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8742232Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8742920Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8743516Z 
test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8744127Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8744748Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8745311Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8745803Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8746263Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8746655Z test_ddp_inference (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8747048Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8747466Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8747918Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8748361Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8748794Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8749268Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8749748Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8750157Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8750561Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8750978Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8751420Z test_ddp_profiling_autograd_profiler 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8751844Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8752276Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8752700Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8753120Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8753561Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8753992Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8754388Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8754813Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8755624Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8756041Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8756454Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8756931Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8757378Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8757769Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8758156Z test_destroy_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8758564Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8758975Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8759401Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8759882Z test_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8760259Z test_gather_checks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8760619Z 
test_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8761008Z test_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8761397Z test_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8761760Z test_gather_object (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8762156Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8762548Z test_get_backend (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8762900Z test_get_future (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8763268Z test_get_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8763657Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8764063Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8764450Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8764830Z test_irecv (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8765188Z test_isend (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8765562Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8766038Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8766479Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8766924Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8767382Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8767804Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8768232Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8768654Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8769092Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) 
2022-11-23T02:07:36.8769523Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8769929Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8770353Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8770772Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8771194Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8771580Z test_new_subgroups (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8771987Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8772466Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8772946Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8773427Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8773885Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8774354Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8774804Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8775247Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8775674Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8776091Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8776537Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8776996Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8777548Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8778053Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8778547Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8778978Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8779368Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8779776Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8780191Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8780589Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8781002Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8781398Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8781799Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8782167Z test_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8782537Z test_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8782922Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8783292Z test_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8783745Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8784174Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8784544Z test_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8784924Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8785319Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8785711Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8786065Z test_scatter (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8786438Z test_scatter_checks 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8786830Z test_scatter_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8787192Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8787583Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8787986Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8788363Z test_scatter_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8788756Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8789135Z test_send_recv (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8789519Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8789929Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8790382Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8790821Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8791212Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8791623Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8792056Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8792463Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8792867Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8793288Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8793730Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8794133Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8794548Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8794964Z test_stateless_api_with_ddp 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8795800Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8796191Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8796620Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8797081Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8797517Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.8798251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8798717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8799290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8799768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8800004Z 2022-11-23T02:07:36.8800124Z Running tests... 2022-11-23T02:07:36.8800539Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8801060Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8801661Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8802311Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19942 2022-11-23T02:07:36.8802762Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19943 2022-11-23T02:07:36.8803391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8803851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8804440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8804907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8805491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8805943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8806527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8806981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8807440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8807943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8808595Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8809301Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8809837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8810317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8810837Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8811676Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:07:36.8812351Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8813184Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:07:36.8813922Z [1669167807.198263] [d8f8c46cdf70:19942:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8814480Z [1669167807.207835] [d8f8c46cdf70:19943:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.8814997Z [1669167807.204870] [d8f8c46cdf70:19942:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8815491Z [1669167807.204870] [d8f8c46cdf70:19942:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8815959Z [1669167807.213135] [d8f8c46cdf70:19943:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8816438Z [1669167807.213135] [d8f8c46cdf70:19943:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8816982Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8817828Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:07:36.8818555Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8819404Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:07:36.8820065Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8820892Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:07:36.8821546Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8822370Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 
2022-11-23T02:07:36.8823034Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8823861Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:07:36.8824531Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:07:36.8825338Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:07:36.8825825Z ok (6.037s) 2022-11-23T02:07:36.8825979Z 2022-11-23T02:07:36.8826253Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8826570Z Ran 1 test in 6.037s 2022-11-23T02:07:36.8826732Z 2022-11-23T02:07:36.8826828Z OK 2022-11-23T02:07:36.8826963Z 2022-11-23T02:07:36.8827092Z Generating XML reports... 2022-11-23T02:07:36.8827709Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014322.xml 2022-11-23T02:07:36.8828420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8828877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8829464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8829923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8830221Z 2022-11-23T02:07:36.8830334Z Running tests... 
2022-11-23T02:07:36.8830748Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8831283Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8831807Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.004s) 2022-11-23T02:07:36.8832122Z 2022-11-23T02:07:36.8832388Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8832720Z Ran 1 test in 0.004s 2022-11-23T02:07:36.8832886Z 2022-11-23T02:07:36.8832978Z OK (skipped=1) 2022-11-23T02:07:36.8833137Z 2022-11-23T02:07:36.8833264Z Generating XML reports... 2022-11-23T02:07:36.8833870Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014330.xml 2022-11-23T02:07:36.8834592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8835354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8835951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8836506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8836752Z 2022-11-23T02:07:36.8836867Z Running tests... 2022-11-23T02:07:36.8837258Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8837795Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8838312Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8838789Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20090 2022-11-23T02:07:36.8839244Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20091 2022-11-23T02:07:36.8839870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8840330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8840896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8841372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8841961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8842395Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8842976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8843446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8843908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8844392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8845057Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8845759Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8846290Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8846748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8847092Z ok (4.197s) 2022-11-23T02:07:36.8847244Z 2022-11-23T02:07:36.8847517Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8847831Z Ran 1 test in 4.197s 2022-11-23T02:07:36.8848075Z 2022-11-23T02:07:36.8848172Z OK 2022-11-23T02:07:36.8848311Z 2022-11-23T02:07:36.8848440Z Generating XML reports... 2022-11-23T02:07:36.8849036Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014333.xml 2022-11-23T02:07:36.8849761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8850221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8850807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8851265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8851499Z 2022-11-23T02:07:36.8851610Z Running tests... 2022-11-23T02:07:36.8852020Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8852556Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8853082Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8854188Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77317 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.623s) 2022-11-23T02:07:36.8854721Z 2022-11-23T02:07:36.8854996Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8855332Z Ran 1 test in 1.623s 2022-11-23T02:07:36.8855497Z 2022-11-23T02:07:36.8855588Z OK (skipped=1) 2022-11-23T02:07:36.8855746Z 2022-11-23T02:07:36.8855875Z Generating XML reports... 2022-11-23T02:07:36.8856485Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014339.xml 2022-11-23T02:07:36.8857216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8857654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8858243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8858719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8858949Z 2022-11-23T02:07:36.8859060Z Running tests... 2022-11-23T02:07:36.8859449Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8859983Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8860528Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8861037Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20227 2022-11-23T02:07:36.8861497Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20228 2022-11-23T02:07:36.8862118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8862582Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8863149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8863623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8864208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8864643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8865221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8865765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8866222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8866703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8867369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8868072Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8868602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8869094Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4mv3hyv5 2022-11-23T02:07:36.8869641Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4mv3hyv5/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8870158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8870644Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5v33eii0 2022-11-23T02:07:36.8871186Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5v33eii0/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8871771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8872330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8872848Z [1669167828.004141] [d8f8c46cdf70:20228:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8873371Z [1669167828.771616] [d8f8c46cdf70:20228:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8873865Z [1669167828.771616] [d8f8c46cdf70:20228:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8874398Z [1669167827.979391] [d8f8c46cdf70:20227:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.8874897Z [1669167828.782639] [d8f8c46cdf70:20227:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8875735Z [1669167828.782639] [d8f8c46cdf70:20227:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8876091Z ok (5.568s) 2022-11-23T02:07:36.8876245Z 2022-11-23T02:07:36.8876528Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8876845Z Ran 1 test in 5.569s 2022-11-23T02:07:36.8877014Z 2022-11-23T02:07:36.8877112Z OK 2022-11-23T02:07:36.8877247Z 2022-11-23T02:07:36.8877374Z Generating XML reports... 2022-11-23T02:07:36.8877969Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014343.xml 2022-11-23T02:07:36.8878709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8879180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8879773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8880228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8880467Z 2022-11-23T02:07:36.8880580Z Running tests... 2022-11-23T02:07:36.8881038Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8881554Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8882123Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8882756Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20341 2022-11-23T02:07:36.8883221Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20342 2022-11-23T02:07:36.8883824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8884283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8884869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8885338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8885935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8886389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8886975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8887435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8887900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8888468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8889142Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8889827Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8890359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8890871Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyrl9eh98 2022-11-23T02:07:36.8891424Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyrl9eh98/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8891923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8892431Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0jkq40lh 2022-11-23T02:07:36.8892972Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0jkq40lh/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8893478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8893977Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8894514Z [1669167835.938750] [d8f8c46cdf70:20342:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8895032Z [1669167836.707428] [d8f8c46cdf70:20342:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8895515Z [1669167836.707428] [d8f8c46cdf70:20342:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8896042Z [1669167835.918192] [d8f8c46cdf70:20341:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.8896563Z [1669167836.718110] [d8f8c46cdf70:20341:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8897042Z [1669167836.718110] [d8f8c46cdf70:20341:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8897384Z ok (5.513s) 2022-11-23T02:07:36.8897535Z 2022-11-23T02:07:36.8897810Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8898151Z Ran 1 test in 5.513s 2022-11-23T02:07:36.8898316Z 2022-11-23T02:07:36.8898413Z OK 2022-11-23T02:07:36.8898530Z 2022-11-23T02:07:36.8898722Z Generating XML reports... 2022-11-23T02:07:36.8899344Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014352.xml 2022-11-23T02:07:36.8900072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8900522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8901110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8901584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8901822Z 2022-11-23T02:07:36.8901933Z Running tests... 2022-11-23T02:07:36.8902324Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8902860Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8903421Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8903940Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20455 2022-11-23T02:07:36.8904395Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20456 2022-11-23T02:07:36.8905059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8905530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8906099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8906586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8907173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8907618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8908182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8908657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8909115Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8909602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8910274Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8910973Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8911503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8911966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8912481Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1fyo5xyp 2022-11-23T02:07:36.8913028Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1fyo5xyp/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8913575Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptti0qiar 2022-11-23T02:07:36.8914093Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptti0qiar/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8914658Z [1669167844.780484] [d8f8c46cdf70:20456:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8915488Z [1669167844.787760] [d8f8c46cdf70:20456:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8915979Z [1669167844.787760] [d8f8c46cdf70:20456:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8916569Z [1669167844.770838] [d8f8c46cdf70:20455:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8917088Z [1669167844.776167] [d8f8c46cdf70:20455:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8917575Z [1669167844.776167] [d8f8c46cdf70:20455:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8918067Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8918544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.8919033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8919517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8919980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8920460Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8920939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8921467Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8921936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8922414Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8922895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8923369Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8923700Z ok (6.270s) 2022-11-23T02:07:36.8923849Z 2022-11-23T02:07:36.8924139Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8924473Z Ran 1 test in 6.270s 2022-11-23T02:07:36.8924637Z 2022-11-23T02:07:36.8924715Z OK 2022-11-23T02:07:36.8924851Z 2022-11-23T02:07:36.8924977Z Generating XML reports... 
2022-11-23T02:07:36.8925598Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014400.xml 2022-11-23T02:07:36.8926324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8926763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8927350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8927833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8928070Z 2022-11-23T02:07:36.8928163Z Running tests... 2022-11-23T02:07:36.8928587Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8929124Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8929698Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8930230Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20573 2022-11-23T02:07:36.8930684Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20574 2022-11-23T02:07:36.8931307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8931749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8932336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8932817Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8933470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8933910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8934492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8934968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8935430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8935917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8936586Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8937288Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8937828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8938288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8938803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl81cphkc 2022-11-23T02:07:36.8939421Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl81cphkc/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8939951Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw6cij4gv 2022-11-23T02:07:36.8940500Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw6cij4gv/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8941067Z [1669167853.661401] [d8f8c46cdf70:20574:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8941598Z [1669167853.666288] [d8f8c46cdf70:20574:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8942065Z [1669167853.666288] [d8f8c46cdf70:20574:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8942601Z [1669167853.652153] [d8f8c46cdf70:20573:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8943119Z [1669167853.658813] [d8f8c46cdf70:20573:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8943607Z [1669167853.658813] [d8f8c46cdf70:20573:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8944079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8944572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.8945066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8945545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8945885Z ok (5.563s) 2022-11-23T02:07:36.8946036Z 2022-11-23T02:07:36.8946319Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8946662Z Ran 1 test in 5.563s 2022-11-23T02:07:36.8946827Z 2022-11-23T02:07:36.8946904Z OK 2022-11-23T02:07:36.8947044Z 2022-11-23T02:07:36.8947171Z Generating XML reports... 2022-11-23T02:07:36.8947786Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014408.xml 2022-11-23T02:07:36.8948509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8948948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8949601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8950081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8950319Z 2022-11-23T02:07:36.8950431Z Running tests... 2022-11-23T02:07:36.8950822Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8951362Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8951938Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8952474Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20691 2022-11-23T02:07:36.8952929Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20692 2022-11-23T02:07:36.8953547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8954017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8954584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8955316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8955994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8956440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8957030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8957508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8957969Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8958456Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8959124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8959821Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8960355Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8960818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8961325Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsh0284hb 2022-11-23T02:07:36.8961873Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsh0284hb/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8962417Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi9kw5888 2022-11-23T02:07:36.8962945Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi9kw5888/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8963515Z [1669167861.690198] [d8f8c46cdf70:20692:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8964043Z [1669167861.695826] [d8f8c46cdf70:20692:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8964535Z [1669167861.695826] [d8f8c46cdf70:20692:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8965044Z [1669167861.682480] [d8f8c46cdf70:20691:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8965564Z [1669167861.687998] [d8f8c46cdf70:20691:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8966122Z [1669167861.687998] [d8f8c46cdf70:20691:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8966612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8967083Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.8967572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8968061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8968523Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8968996Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8969478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8969955Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8970289Z ok (5.649s) 2022-11-23T02:07:36.8970440Z 2022-11-23T02:07:36.8970718Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8971053Z Ran 1 test in 5.649s 2022-11-23T02:07:36.8971218Z 2022-11-23T02:07:36.8971295Z OK 2022-11-23T02:07:36.8971432Z 2022-11-23T02:07:36.8971560Z Generating XML reports... 2022-11-23T02:07:36.8972228Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014417.xml 2022-11-23T02:07:36.8972970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8973410Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8973996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8974473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8974715Z 2022-11-23T02:07:36.8974827Z Running tests... 
2022-11-23T02:07:36.8975221Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8975757Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8976359Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8976911Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20809 2022-11-23T02:07:36.8977368Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20810 2022-11-23T02:07:36.8977991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8978450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8979016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8979496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8980083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8980520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8981161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8981639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8982095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.8982583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.8983246Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.8984014Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.8984550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.8985017Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.8985523Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu939y8di 2022-11-23T02:07:36.8986069Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu939y8di/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8986608Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz2_plfnx 2022-11-23T02:07:36.8987129Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz2_plfnx/_remote_module_non_scriptable.py 2022-11-23T02:07:36.8987689Z [1669167869.951557] [d8f8c46cdf70:20809:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8988214Z [1669167869.958823] [d8f8c46cdf70:20809:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8988747Z [1669167869.958823] [d8f8c46cdf70:20809:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8989289Z [1669167869.960595] [d8f8c46cdf70:20810:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.8989808Z [1669167869.967322] [d8f8c46cdf70:20810:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.8990295Z [1669167869.967322] [d8f8c46cdf70:20810:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.8990784Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.8991262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.8991617Z ok (5.737s) 2022-11-23T02:07:36.8991766Z 2022-11-23T02:07:36.8992050Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8992363Z Ran 1 test in 5.737s 2022-11-23T02:07:36.8992527Z 2022-11-23T02:07:36.8992631Z OK 2022-11-23T02:07:36.8992765Z 2022-11-23T02:07:36.8992890Z Generating XML reports... 2022-11-23T02:07:36.8993491Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014425.xml 2022-11-23T02:07:36.8994206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.8994661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.8995503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.8995970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.8996205Z 2022-11-23T02:07:36.8996313Z Running tests... 2022-11-23T02:07:36.8996728Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.8997264Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.8997837Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.8998407Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20927 2022-11-23T02:07:36.8998857Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20928 2022-11-23T02:07:36.8999461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9000011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9000597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9001071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9001638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9002096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9002677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9003145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9003588Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9004094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9004758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9005461Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9006033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9006528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9007029Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpswti0w_v 2022-11-23T02:07:36.9007565Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpswti0w_v/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9008158Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcnbcualn 2022-11-23T02:07:36.9008702Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcnbcualn/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9009270Z [1669167878.250071] [d8f8c46cdf70:20927:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9009770Z [1669167878.255940] [d8f8c46cdf70:20927:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9010262Z [1669167878.255940] [d8f8c46cdf70:20927:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9010795Z [1669167878.257546] [d8f8c46cdf70:20928:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9011310Z [1669167878.262630] [d8f8c46cdf70:20928:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9011771Z [1669167878.262630] [d8f8c46cdf70:20928:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9012260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9012752Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.9013253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9013710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9014066Z ok (6.165s) 2022-11-23T02:07:36.9014216Z 2022-11-23T02:07:36.9014499Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9014829Z Ran 1 test in 6.165s 2022-11-23T02:07:36.9014974Z 2022-11-23T02:07:36.9015069Z OK 2022-11-23T02:07:36.9015203Z 2022-11-23T02:07:36.9015327Z Generating XML reports... 2022-11-23T02:07:36.9015945Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014433.xml 2022-11-23T02:07:36.9016720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9017184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9017762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9018242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9018458Z 2022-11-23T02:07:36.9018574Z Running tests... 2022-11-23T02:07:36.9018980Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9019518Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9020066Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9020614Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21045 2022-11-23T02:07:36.9021073Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21046 2022-11-23T02:07:36.9021684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9022122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9022763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9023256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9023852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9024290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9024869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9025351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9025796Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9026301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9026963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9027663Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9028163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9028638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9029143Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwfz56neo 2022-11-23T02:07:36.9029694Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwfz56neo/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9030214Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeu9253dk 2022-11-23T02:07:36.9030752Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeu9253dk/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9031316Z [1669167886.879669] [d8f8c46cdf70:21046:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9031841Z [1669167886.886419] [d8f8c46cdf70:21046:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9032310Z [1669167886.886419] [d8f8c46cdf70:21046:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9032833Z [1669167886.870340] [d8f8c46cdf70:21045:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9033403Z [1669167886.877144] [d8f8c46cdf70:21045:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9033878Z [1669167886.877144] [d8f8c46cdf70:21045:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9034353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9034847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.9035588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9036055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9036407Z ok (6.014s) 2022-11-23T02:07:36.9036556Z 2022-11-23T02:07:36.9036838Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9037181Z Ran 1 test in 6.014s 2022-11-23T02:07:36.9037326Z 2022-11-23T02:07:36.9037422Z OK 2022-11-23T02:07:36.9037559Z 2022-11-23T02:07:36.9037685Z Generating XML reports... 2022-11-23T02:07:36.9038298Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014442.xml 2022-11-23T02:07:36.9039076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9039547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9040130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9040608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9040825Z 2022-11-23T02:07:36.9040936Z Running tests... 2022-11-23T02:07:36.9041343Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9041877Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9042450Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9043016Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21163 2022-11-23T02:07:36.9043477Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21164 2022-11-23T02:07:36.9044093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9044530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9045111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9045582Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9046171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9046599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9047179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9047657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9048103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9048601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9049273Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9049968Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9050561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9051034Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9051535Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8n9aq08t 2022-11-23T02:07:36.9052078Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8n9aq08t/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9052598Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp940tt06v 2022-11-23T02:07:36.9053129Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp940tt06v/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9053685Z [1669167895.396740] [d8f8c46cdf70:21164:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9054185Z [1669167895.402563] [d8f8c46cdf70:21164:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9054672Z [1669167895.402563] [d8f8c46cdf70:21164:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9055193Z [1669167895.393097] [d8f8c46cdf70:21163:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9055763Z [1669167895.399853] [d8f8c46cdf70:21163:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9056241Z [1669167895.399853] [d8f8c46cdf70:21163:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9056724Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9057209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.9057690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9058158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9058509Z ok (5.559s) 2022-11-23T02:07:36.9058658Z 2022-11-23T02:07:36.9058934Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9059263Z Ran 1 test in 5.559s 2022-11-23T02:07:36.9059412Z 2022-11-23T02:07:36.9059509Z OK 2022-11-23T02:07:36.9059643Z 2022-11-23T02:07:36.9059768Z Generating XML reports... 2022-11-23T02:07:36.9060376Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014450.xml 2022-11-23T02:07:36.9061080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9061532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9062114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9062591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9062808Z 2022-11-23T02:07:36.9062917Z Running tests... 2022-11-23T02:07:36.9063322Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9063855Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9064393Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9065467Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/76428 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.631s) 2022-11-23T02:07:36.9065984Z 2022-11-23T02:07:36.9066312Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9066640Z Ran 1 test in 1.631s 2022-11-23T02:07:36.9066802Z 2022-11-23T02:07:36.9066894Z OK (skipped=1) 2022-11-23T02:07:36.9067048Z 2022-11-23T02:07:36.9067173Z Generating XML reports... 2022-11-23T02:07:36.9067783Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014458.xml 2022-11-23T02:07:36.9068498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9068935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9069511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9069985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9070217Z 2022-11-23T02:07:36.9070327Z Running tests... 2022-11-23T02:07:36.9070720Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9071250Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9071800Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9072368Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21315 2022-11-23T02:07:36.9072833Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21316 2022-11-23T02:07:36.9073447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9073900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9074463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9074932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9075764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9076193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9076768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9077235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9077690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9078175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9078841Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9079538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9080067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9080526Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9080907Z ok (4.364s) 2022-11-23T02:07:36.9081057Z 2022-11-23T02:07:36.9081334Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9081650Z Ran 1 test in 4.365s 2022-11-23T02:07:36.9081810Z 2022-11-23T02:07:36.9081904Z OK 2022-11-23T02:07:36.9082038Z 2022-11-23T02:07:36.9082165Z Generating XML reports... 2022-11-23T02:07:36.9082753Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014503.xml 2022-11-23T02:07:36.9083473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9083927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9084600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9085055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9085285Z 2022-11-23T02:07:36.9085394Z Running tests... 2022-11-23T02:07:36.9085803Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9086336Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9086883Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9087953Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77294 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.601s) 2022-11-23T02:07:36.9088468Z 2022-11-23T02:07:36.9088736Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9089064Z Ran 1 test in 1.602s 2022-11-23T02:07:36.9089210Z 2022-11-23T02:07:36.9089319Z OK (skipped=1) 2022-11-23T02:07:36.9089472Z 2022-11-23T02:07:36.9089597Z Generating XML reports... 2022-11-23T02:07:36.9090267Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014509.xml 2022-11-23T02:07:36.9090981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9091432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9092011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9092484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9092723Z 2022-11-23T02:07:36.9092817Z Running tests... 2022-11-23T02:07:36.9093220Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9093748Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9094263Z test_DistributedSampler_padding (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9094771Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21452 2022-11-23T02:07:36.9095221Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21453 2022-11-23T02:07:36.9095831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9096265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9096845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9097320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9097905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9098335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9098912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9099380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9099819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9100314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9100975Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9101729Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9102238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9102714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9103237Z [1669167918.705736] [d8f8c46cdf70:21452:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9103757Z [1669167918.711187] [d8f8c46cdf70:21452:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9104238Z [1669167918.711187] [d8f8c46cdf70:21452:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9104766Z [1669167918.711378] [d8f8c46cdf70:21453:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9105282Z [1669167918.718072] [d8f8c46cdf70:21453:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9105745Z [1669167918.718072] [d8f8c46cdf70:21453:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9106143Z ok (5.527s) 2022-11-23T02:07:36.9106299Z 2022-11-23T02:07:36.9106573Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9106888Z Ran 1 test in 5.527s 2022-11-23T02:07:36.9107051Z 2022-11-23T02:07:36.9107145Z OK 2022-11-23T02:07:36.9107281Z 2022-11-23T02:07:36.9107407Z Generating XML reports... 
2022-11-23T02:07:36.9108018Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014514.xml 2022-11-23T02:07:36.9108720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9109180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9109762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9110219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9110452Z 2022-11-23T02:07:36.9110567Z Running tests... 2022-11-23T02:07:36.9110977Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9111513Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9112007Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) ... skip: no torchvision (0.002s) 2022-11-23T02:07:36.9112297Z 2022-11-23T02:07:36.9112564Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9112899Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9113065Z 2022-11-23T02:07:36.9113158Z OK (skipped=1) 2022-11-23T02:07:36.9113313Z 2022-11-23T02:07:36.9113439Z Generating XML reports... 
2022-11-23T02:07:36.9114045Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014522.xml 2022-11-23T02:07:36.9114766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9115446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9116029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9116500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9116734Z 2022-11-23T02:07:36.9116843Z Running tests... 2022-11-23T02:07:36.9117232Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9117771Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9118322Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.9118817Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:07:36.9119121Z 2022-11-23T02:07:36.9119393Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9119722Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9119887Z 2022-11-23T02:07:36.9119997Z OK (skipped=1) 2022-11-23T02:07:36.9120134Z 2022-11-23T02:07:36.9120258Z Generating XML reports... 
2022-11-23T02:07:36.9120869Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014524.xml 2022-11-23T02:07:36.9121584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9122022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9122607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9123080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9123312Z 2022-11-23T02:07:36.9123425Z Running tests... 2022-11-23T02:07:36.9123875Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9124421Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9124905Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.9125421Z Runs multiple iterations on _test_accumulate_gradients_no_sync ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:07:36.9125731Z 2022-11-23T02:07:36.9125996Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9126327Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9126493Z 2022-11-23T02:07:36.9126603Z OK (skipped=1) 2022-11-23T02:07:36.9126741Z 2022-11-23T02:07:36.9126867Z Generating XML reports... 
2022-11-23T02:07:36.9127471Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014526.xml 2022-11-23T02:07:36.9128193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9128631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9129214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9129684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9129915Z 2022-11-23T02:07:36.9130026Z Running tests... 2022-11-23T02:07:36.9130416Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9130948Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9131453Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.9132020Z Runs multiple iterations on _test_accumulate_gradients_no_sync using allreduce ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:07:36.9132331Z 2022-11-23T02:07:36.9132601Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9132930Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9133093Z 2022-11-23T02:07:36.9133202Z OK (skipped=1) 2022-11-23T02:07:36.9133359Z 2022-11-23T02:07:36.9133466Z Generating XML reports... 
2022-11-23T02:07:36.9134074Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014529.xml 2022-11-23T02:07:36.9134796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9135314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9135879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9136351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9136581Z 2022-11-23T02:07:36.9136692Z Running tests... 2022-11-23T02:07:36.9137084Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9137618Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9138101Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:07:36.9138628Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:07:36.9138931Z 2022-11-23T02:07:36.9139178Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9139510Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9139672Z 2022-11-23T02:07:36.9139783Z OK (skipped=1) 2022-11-23T02:07:36.9139937Z 2022-11-23T02:07:36.9140062Z Generating XML reports... 
2022-11-23T02:07:36.9140647Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014531.xml 2022-11-23T02:07:36.9141413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9141872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9142440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9142909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9143139Z 2022-11-23T02:07:36.9143250Z Running tests... 2022-11-23T02:07:36.9143657Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9144174Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9144677Z test_all_gather (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9145159Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21731 2022-11-23T02:07:36.9145597Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21732 2022-11-23T02:07:36.9146207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9146666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9147245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9147698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9148281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9148737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9149297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9149763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9150223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9150724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9151373Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9152066Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9152658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9153250Z STAGE:2022-11-23 01:45:37 21732:21732 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9153716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9154300Z STAGE:2022-11-23 01:45:38 21731:21731 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9154831Z [1669167938.023459] [d8f8c46cdf70:21731:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9155575Z [1669167939.066827] [d8f8c46cdf70:21731:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9156050Z [1669167939.066827] [d8f8c46cdf70:21731:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9156578Z [1669167938.044440] [d8f8c46cdf70:21732:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9157094Z [1669167939.054625] [d8f8c46cdf70:21732:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9157649Z [1669167939.054625] [d8f8c46cdf70:21732:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9158466Z STAGE:2022-11-23 01:45:39 21731:21731 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:45:39 21732:21732 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9158856Z 2022-11-23T02:07:36.9159211Z STAGE:2022-11-23 01:45:39 21732:21732 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9159823Z STAGE:2022-11-23 01:45:39 21731:21731 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9160419Z STAGE:2022-11-23 01:45:39 21732:21732 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9160977Z STAGE:2022-11-23 01:45:39 21731:21731 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9161565Z STAGE:2022-11-23 01:45:39 21732:21732 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9162161Z STAGE:2022-11-23 01:45:39 21731:21731 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9162760Z STAGE:2022-11-23 01:45:39 21732:21732 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9163350Z STAGE:2022-11-23 01:45:39 21731:21731 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9163713Z ok (5.864s) 2022-11-23T02:07:36.9163863Z 2022-11-23T02:07:36.9164135Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9164450Z Ran 1 test in 5.864s 2022-11-23T02:07:36.9164612Z 2022-11-23T02:07:36.9164712Z OK 2022-11-23T02:07:36.9164848Z 2022-11-23T02:07:36.9164976Z Generating XML reports... 
2022-11-23T02:07:36.9165565Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014534.xml 2022-11-23T02:07:36.9166288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9166750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9167336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9167796Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9168026Z 2022-11-23T02:07:36.9168139Z Running tests... 2022-11-23T02:07:36.9168550Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9169084Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9169687Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:07:36.9170008Z 2022-11-23T02:07:36.9170274Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9170600Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9170763Z 2022-11-23T02:07:36.9170857Z OK (skipped=1) 2022-11-23T02:07:36.9171017Z 2022-11-23T02:07:36.9171143Z Generating XML reports... 
2022-11-23T02:07:36.9171749Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014542.xml 2022-11-23T02:07:36.9172518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9172958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9173537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9174014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9174291Z 2022-11-23T02:07:36.9174402Z Running tests... 2022-11-23T02:07:36.9174793Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9175325Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9175938Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:07:36.9176270Z 2022-11-23T02:07:36.9176521Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9176851Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9176871Z 2022-11-23T02:07:36.9176980Z OK (skipped=1) 2022-11-23T02:07:36.9176998Z 2022-11-23T02:07:36.9177123Z Generating XML reports... 
2022-11-23T02:07:36.9177578Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014544.xml 2022-11-23T02:07:36.9177960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9178141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9178512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9178708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9178727Z 2022-11-23T02:07:36.9178838Z Running tests... 2022-11-23T02:07:36.9179104Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9179416Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9179709Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:07:36.9179732Z 2022-11-23T02:07:36.9179998Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9180113Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9180133Z 2022-11-23T02:07:36.9180243Z OK (skipped=1) 2022-11-23T02:07:36.9180263Z 2022-11-23T02:07:36.9180370Z Generating XML reports... 
2022-11-23T02:07:36.9180861Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014547.xml 2022-11-23T02:07:36.9181242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9181420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9181806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9182004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9182024Z 2022-11-23T02:07:36.9182194Z Running tests... 2022-11-23T02:07:36.9182461Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9182756Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9183052Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:07:36.9183073Z 2022-11-23T02:07:36.9183340Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9183455Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9183475Z 2022-11-23T02:07:36.9183584Z OK (skipped=1) 2022-11-23T02:07:36.9183603Z 2022-11-23T02:07:36.9183727Z Generating XML reports... 
2022-11-23T02:07:36.9184177Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014549.xml 2022-11-23T02:07:36.9184553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9184738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9185107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9185303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9185323Z 2022-11-23T02:07:36.9185477Z Running tests... 2022-11-23T02:07:36.9185751Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9186065Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9186363Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.003s) 2022-11-23T02:07:36.9186384Z 2022-11-23T02:07:36.9186647Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9186758Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9186781Z 2022-11-23T02:07:36.9186889Z OK (skipped=1) 2022-11-23T02:07:36.9186908Z 2022-11-23T02:07:36.9187015Z Generating XML reports... 
2022-11-23T02:07:36.9187465Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014552.xml 2022-11-23T02:07:36.9187845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9188025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9188408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9188602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9188622Z 2022-11-23T02:07:36.9188735Z Running tests... 2022-11-23T02:07:36.9189000Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9189315Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9189565Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9189790Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22010 2022-11-23T02:07:36.9190005Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22011 2022-11-23T02:07:36.9190384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9190561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9190948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9191144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9191512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9191733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9192114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9192305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9192560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9192800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9193206Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9193608Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9193843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9194191Z STAGE:2022-11-23 01:45:58 22010:22010 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9194408Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9194750Z STAGE:2022-11-23 01:45:58 22011:22011 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9195313Z [1669167958.470940] [d8f8c46cdf70:22011:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9195578Z [1669167959.496343] [d8f8c46cdf70:22011:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9195824Z [1669167959.496343] [d8f8c46cdf70:22011:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9196107Z [1669167958.450449] [d8f8c46cdf70:22010:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9196349Z [1669167959.488722] [d8f8c46cdf70:22010:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9196593Z [1669167959.488722] [d8f8c46cdf70:22010:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9197168Z STAGE:2022-11-23 01:45:59 22011:22011 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:45:59 22010:22010 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9197189Z 2022-11-23T02:07:36.9197543Z STAGE:2022-11-23 01:45:59 22010:22010 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9197896Z STAGE:2022-11-23 01:45:59 22011:22011 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9198211Z STAGE:2022-11-23 01:45:59 22010:22010 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9198549Z STAGE:2022-11-23 01:45:59 22011:22011 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9198892Z STAGE:2022-11-23 01:45:59 22011:22011 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9199457Z STAGE:2022-11-23 01:45:59 22010:22010 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:45:59 22011:22011 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9199477Z 2022-11-23T02:07:36.9199830Z STAGE:2022-11-23 01:45:59 22010:22010 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9230732Z ok (5.860s) 2022-11-23T02:07:36.9230764Z 2022-11-23T02:07:36.9231121Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9231247Z Ran 1 test in 5.861s 2022-11-23T02:07:36.9231267Z 2022-11-23T02:07:36.9231343Z OK 2022-11-23T02:07:36.9231382Z 2022-11-23T02:07:36.9231647Z Generating XML reports... 
2022-11-23T02:07:36.9232127Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014554.xml 2022-11-23T02:07:36.9232510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9232690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9233083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9233282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9233302Z 2022-11-23T02:07:36.9233413Z Running tests... 2022-11-23T02:07:36.9233681Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9233979Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9234252Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T02:07:36.9234276Z 2022-11-23T02:07:36.9234538Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9234654Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9234673Z 2022-11-23T02:07:36.9234782Z OK (skipped=1) 2022-11-23T02:07:36.9234802Z 2022-11-23T02:07:36.9234927Z Generating XML reports... 
2022-11-23T02:07:36.9235736Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014602.xml 2022-11-23T02:07:36.9236139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9236319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9236689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9236884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9236909Z 2022-11-23T02:07:36.9237021Z Running tests... 2022-11-23T02:07:36.9237287Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9237602Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9237887Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T02:07:36.9237908Z 2022-11-23T02:07:36.9238176Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9238291Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9238311Z 2022-11-23T02:07:36.9238401Z OK (skipped=1) 2022-11-23T02:07:36.9238439Z 2022-11-23T02:07:36.9238548Z Generating XML reports... 
2022-11-23T02:07:36.9239003Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014605.xml 2022-11-23T02:07:36.9239380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9239565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9239949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9240149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9240169Z 2022-11-23T02:07:36.9240281Z Running tests... 2022-11-23T02:07:36.9240547Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9240844Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9241113Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9241336Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22190 2022-11-23T02:07:36.9241632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22191 2022-11-23T02:07:36.9242011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9242188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9242578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9242775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9243126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9243302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9243684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9243879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9244135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9244383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9244838Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9245253Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9245495Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9245725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9245952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9246192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9246602Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9247004Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9247353Z STAGE:2022-11-23 01:46:11 22191:22191 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9247685Z STAGE:2022-11-23 01:46:11 22190:22190 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9247971Z [1669167971.634126] [d8f8c46cdf70:22191:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9248214Z [1669167972.650802] [d8f8c46cdf70:22191:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9248462Z [1669167972.650802] [d8f8c46cdf70:22191:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9248731Z [1669167971.613389] [d8f8c46cdf70:22190:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9248967Z [1669167972.648175] [d8f8c46cdf70:22190:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9249213Z [1669167972.648175] [d8f8c46cdf70:22190:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9249776Z STAGE:2022-11-23 01:46:13 22191:22191 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:46:13 22190:22190 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9249797Z 2022-11-23T02:07:36.9250155Z STAGE:2022-11-23 01:46:13 22191:22191 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9250572Z STAGE:2022-11-23 01:46:13 22190:22190 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9250907Z STAGE:2022-11-23 01:46:13 22191:22191 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9251235Z STAGE:2022-11-23 01:46:13 22190:22190 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9251576Z STAGE:2022-11-23 01:46:13 22191:22191 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9251906Z STAGE:2022-11-23 01:46:13 22190:22190 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9252240Z STAGE:2022-11-23 01:46:13 22191:22191 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9252591Z STAGE:2022-11-23 01:46:13 22190:22190 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9252699Z ok (5.816s) 2022-11-23T02:07:36.9252719Z 2022-11-23T02:07:36.9252991Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9253102Z Ran 1 test in 5.817s 2022-11-23T02:07:36.9253122Z 2022-11-23T02:07:36.9253212Z OK 2022-11-23T02:07:36.9253231Z 2022-11-23T02:07:36.9253359Z Generating XML reports... 
2022-11-23T02:07:36.9253816Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014607.xml 2022-11-23T02:07:36.9254243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9254415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9254801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9254997Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9255017Z 2022-11-23T02:07:36.9255128Z Running tests... 2022-11-23T02:07:36.9255399Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9255719Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9255982Z test_all_gather_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9256207Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22304 2022-11-23T02:07:36.9256414Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22305 2022-11-23T02:07:36.9256793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9256974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9257355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9257549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9257925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9258105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9258492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9258687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9258918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9259165Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9259573Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9259973Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9260264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9260499Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9260661Z skip: Skipped due to small world size. (4.259s) 2022-11-23T02:07:36.9260682Z 2022-11-23T02:07:36.9260958Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9261056Z Ran 1 test in 4.259s 2022-11-23T02:07:36.9261095Z 2022-11-23T02:07:36.9261187Z OK (skipped=1) 2022-11-23T02:07:36.9261206Z 2022-11-23T02:07:36.9261333Z Generating XML reports... 2022-11-23T02:07:36.9261783Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014616.xml 2022-11-23T02:07:36.9262166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9262347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9262733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9262928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9262948Z 2022-11-23T02:07:36.9263060Z Running tests... 2022-11-23T02:07:36.9263352Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9263679Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9263980Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T02:07:36.9264001Z 2022-11-23T02:07:36.9264265Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9264380Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9264400Z 2022-11-23T02:07:36.9264510Z OK (skipped=1) 2022-11-23T02:07:36.9264534Z 2022-11-23T02:07:36.9264660Z Generating XML reports... 2022-11-23T02:07:36.9265112Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014622.xml 2022-11-23T02:07:36.9265495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9265659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9266045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9266240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9266260Z 2022-11-23T02:07:36.9266370Z Running tests... 2022-11-23T02:07:36.9266635Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9266949Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9267254Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T02:07:36.9267274Z 2022-11-23T02:07:36.9267536Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9267651Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9267670Z 2022-11-23T02:07:36.9267761Z OK (skipped=1) 2022-11-23T02:07:36.9267784Z 2022-11-23T02:07:36.9267912Z Generating XML reports... 
2022-11-23T02:07:36.9268361Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014625.xml 2022-11-23T02:07:36.9268738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9268920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9269307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9269558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9269579Z 2022-11-23T02:07:36.9269690Z Running tests... 2022-11-23T02:07:36.9269939Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9270252Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9270548Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T02:07:36.9270567Z 2022-11-23T02:07:36.9270830Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9270944Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9270963Z 2022-11-23T02:07:36.9271072Z OK (skipped=1) 2022-11-23T02:07:36.9271091Z 2022-11-23T02:07:36.9271216Z Generating XML reports... 
2022-11-23T02:07:36.9271663Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014627.xml 2022-11-23T02:07:36.9272046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9272208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9272654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9272857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9272877Z 2022-11-23T02:07:36.9272986Z Running tests... 2022-11-23T02:07:36.9273252Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9273565Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9273867Z test_all_gather_multigpu_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T02:07:36.9273890Z 2022-11-23T02:07:36.9274154Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9274267Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9274286Z 2022-11-23T02:07:36.9274378Z OK (skipped=1) 2022-11-23T02:07:36.9274397Z 2022-11-23T02:07:36.9274524Z Generating XML reports... 
2022-11-23T02:07:36.9274976Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014630.xml 2022-11-23T02:07:36.9275585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9275764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9276151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9276347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9276368Z 2022-11-23T02:07:36.9276478Z Running tests... 2022-11-23T02:07:36.9276752Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9277046Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9277328Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9277555Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22539 2022-11-23T02:07:36.9277782Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22540 2022-11-23T02:07:36.9278157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9278337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9278723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9278918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9279363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9279544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9279930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9280123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9280375Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9280624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9281068Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9281474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9281715Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9281929Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9282283Z [1669167996.415667] [d8f8c46cdf70:22540:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9282537Z [1669167997.198269] [d8f8c46cdf70:22540:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9282785Z [1669167997.198269] [d8f8c46cdf70:22540:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9283065Z [1669167996.413135] [d8f8c46cdf70:22539:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9283306Z [1669167997.215688] [d8f8c46cdf70:22539:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9283554Z [1669167997.215688] [d8f8c46cdf70:22539:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9283661Z ok (6.061s) 2022-11-23T02:07:36.9283681Z 2022-11-23T02:07:36.9283956Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9284073Z Ran 1 test in 6.061s 2022-11-23T02:07:36.9284093Z 2022-11-23T02:07:36.9284170Z OK 2022-11-23T02:07:36.9284190Z 2022-11-23T02:07:36.9284317Z Generating XML reports... 
2022-11-23T02:07:36.9284773Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014632.xml 2022-11-23T02:07:36.9285149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9285328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9285716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9285913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9285933Z 2022-11-23T02:07:36.9286044Z Running tests... 2022-11-23T02:07:36.9286294Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9286611Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9286890Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9287112Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22650 2022-11-23T02:07:36.9287337Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22651 2022-11-23T02:07:36.9287710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9287950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9288336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9288529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9288879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9289057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9289441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9289633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9289885Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9290137Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9290541Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9290988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9291228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9291443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9291684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9291925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9292333Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9292730Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9292971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:07:36.9293215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:07:36.9293608Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:36.9294008Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:36.9294233Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:07:36.9294468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:07:36.9294869Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:07:36.9295264Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:07:36.9295549Z [1669168004.989156] [d8f8c46cdf70:22651:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9295788Z [1669168005.775712] [d8f8c46cdf70:22651:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9296029Z [1669168005.775712] [d8f8c46cdf70:22651:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9296301Z [1669168004.989177] [d8f8c46cdf70:22650:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9296591Z [1669168005.769919] [d8f8c46cdf70:22650:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9296830Z [1669168005.769919] [d8f8c46cdf70:22650:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9296918Z ok (6.458s) 2022-11-23T02:07:36.9296937Z 2022-11-23T02:07:36.9297207Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9297315Z Ran 1 test in 6.458s 2022-11-23T02:07:36.9297334Z 2022-11-23T02:07:36.9297416Z OK 2022-11-23T02:07:36.9297435Z 2022-11-23T02:07:36.9297553Z Generating XML reports... 
2022-11-23T02:07:36.9298002Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014641.xml 2022-11-23T02:07:36.9298377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9298558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9298928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9299119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9299139Z 2022-11-23T02:07:36.9299246Z Running tests... 2022-11-23T02:07:36.9299558Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9299879Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9300133Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports all_gather_v (0.003s) 2022-11-23T02:07:36.9300155Z 2022-11-23T02:07:36.9300410Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9300518Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9300537Z 2022-11-23T02:07:36.9300638Z OK (skipped=1) 2022-11-23T02:07:36.9300662Z 2022-11-23T02:07:36.9300769Z Generating XML reports... 
2022-11-23T02:07:36.9301213Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014650.xml 2022-11-23T02:07:36.9301584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9301760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9302141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9302322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9302342Z 2022-11-23T02:07:36.9302440Z Running tests... 2022-11-23T02:07:36.9302696Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9302998Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9303407Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9303427Z 2022-11-23T02:07:36.9303679Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9303781Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9303801Z 2022-11-23T02:07:36.9303899Z OK (skipped=1) 2022-11-23T02:07:36.9303922Z 2022-11-23T02:07:36.9304038Z Generating XML reports... 
2022-11-23T02:07:36.9304476Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014652.xml 2022-11-23T02:07:36.9304841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9305008Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9305380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9305620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9305641Z 2022-11-23T02:07:36.9305740Z Running tests... 2022-11-23T02:07:36.9305998Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9306299Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9306713Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9306733Z 2022-11-23T02:07:36.9306985Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9307087Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9307106Z 2022-11-23T02:07:36.9307203Z OK (skipped=1) 2022-11-23T02:07:36.9307223Z 2022-11-23T02:07:36.9307329Z Generating XML reports... 
2022-11-23T02:07:36.9307767Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014654.xml 2022-11-23T02:07:36.9308134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9308302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9308723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9308914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9308934Z 2022-11-23T02:07:36.9309031Z Running tests... 2022-11-23T02:07:36.9309286Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9309587Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9310000Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9310029Z 2022-11-23T02:07:36.9310274Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9310376Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9310395Z 2022-11-23T02:07:36.9310492Z OK (skipped=1) 2022-11-23T02:07:36.9310511Z 2022-11-23T02:07:36.9310624Z Generating XML reports... 
2022-11-23T02:07:36.9311065Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014657.xml 2022-11-23T02:07:36.9311436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9311604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9311975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9312152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9312171Z 2022-11-23T02:07:36.9312277Z Running tests... 2022-11-23T02:07:36.9312542Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9312848Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9313260Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9313283Z 2022-11-23T02:07:36.9313538Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9313647Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9313667Z 2022-11-23T02:07:36.9313763Z OK (skipped=1) 2022-11-23T02:07:36.9313782Z 2022-11-23T02:07:36.9313896Z Generating XML reports... 
2022-11-23T02:07:36.9314328Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014659.xml 2022-11-23T02:07:36.9314695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9314931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9315613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9315797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9315818Z 2022-11-23T02:07:36.9315923Z Running tests... 2022-11-23T02:07:36.9316189Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9316499Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9316896Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9316923Z 2022-11-23T02:07:36.9317168Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9317269Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9317292Z 2022-11-23T02:07:36.9317393Z OK (skipped=1) 2022-11-23T02:07:36.9317412Z 2022-11-23T02:07:36.9317536Z Generating XML reports... 
2022-11-23T02:07:36.9317979Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014702.xml 2022-11-23T02:07:36.9318422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9318602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9318976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9319152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9319179Z 2022-11-23T02:07:36.9319271Z Running tests... 2022-11-23T02:07:36.9319525Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9319825Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9320231Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9320251Z 2022-11-23T02:07:36.9320507Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9320614Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9320638Z 2022-11-23T02:07:36.9320735Z OK (skipped=1) 2022-11-23T02:07:36.9320754Z 2022-11-23T02:07:36.9320871Z Generating XML reports... 
2022-11-23T02:07:36.9321301Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014704.xml 2022-11-23T02:07:36.9321677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9321854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9322234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9322422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9322441Z 2022-11-23T02:07:36.9322543Z Running tests... 2022-11-23T02:07:36.9322807Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9323117Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9323541Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9323562Z 2022-11-23T02:07:36.9323807Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9323919Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9323938Z 2022-11-23T02:07:36.9324040Z OK (skipped=1) 2022-11-23T02:07:36.9324059Z 2022-11-23T02:07:36.9324180Z Generating XML reports... 
2022-11-23T02:07:36.9324623Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014706.xml 2022-11-23T02:07:36.9325062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9325231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9325613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9325807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9325827Z 2022-11-23T02:07:36.9325919Z Running tests... 2022-11-23T02:07:36.9326183Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9326492Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9326905Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9326928Z 2022-11-23T02:07:36.9327191Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9327304Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9327324Z 2022-11-23T02:07:36.9327431Z OK (skipped=1) 2022-11-23T02:07:36.9327450Z 2022-11-23T02:07:36.9327563Z Generating XML reports... 
2022-11-23T02:07:36.9328040Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014709.xml 2022-11-23T02:07:36.9328416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9328582Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9328953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9329135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9329159Z 2022-11-23T02:07:36.9329258Z Running tests... 2022-11-23T02:07:36.9329513Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9329814Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9330208Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9330228Z 2022-11-23T02:07:36.9330473Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9330576Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9330596Z 2022-11-23T02:07:36.9330702Z OK (skipped=1) 2022-11-23T02:07:36.9330721Z 2022-11-23T02:07:36.9330845Z Generating XML reports... 
2022-11-23T02:07:36.9331290Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014711.xml 2022-11-23T02:07:36.9331663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9331842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9332218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9332409Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9332432Z 2022-11-23T02:07:36.9332525Z Running tests... 2022-11-23T02:07:36.9332785Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9333092Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9333389Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9333601Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23112 2022-11-23T02:07:36.9333817Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23113 2022-11-23T02:07:36.9334252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9334427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9334819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9334996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9335357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9335521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9335892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9336075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9336317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9336551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9336946Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9337385Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9337617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9338358Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:07:36.9338469Z warnings.warn( 2022-11-23T02:07:36.9338688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9339415Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:07:36.9339520Z warnings.warn( 2022-11-23T02:07:36.9339612Z ok (4.348s) 2022-11-23T02:07:36.9339632Z 2022-11-23T02:07:36.9339891Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9339993Z Ran 1 test in 4.349s 2022-11-23T02:07:36.9340013Z 2022-11-23T02:07:36.9340090Z OK 2022-11-23T02:07:36.9340109Z 2022-11-23T02:07:36.9340223Z Generating XML reports... 
2022-11-23T02:07:36.9340659Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014713.xml 2022-11-23T02:07:36.9341026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9341195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9341566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9341753Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9341773Z 2022-11-23T02:07:36.9341873Z Running tests... 2022-11-23T02:07:36.9342121Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9342424Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9342814Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9342834Z 2022-11-23T02:07:36.9343145Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9343249Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9343268Z 2022-11-23T02:07:36.9343367Z OK (skipped=1) 2022-11-23T02:07:36.9343386Z 2022-11-23T02:07:36.9343499Z Generating XML reports... 
2022-11-23T02:07:36.9343938Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014720.xml 2022-11-23T02:07:36.9344299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9344460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9344830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9345014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9345033Z 2022-11-23T02:07:36.9345133Z Running tests... 2022-11-23T02:07:36.9345389Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9345695Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9346098Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9346118Z 2022-11-23T02:07:36.9346426Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9346542Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9346562Z 2022-11-23T02:07:36.9346654Z OK (skipped=1) 2022-11-23T02:07:36.9346673Z 2022-11-23T02:07:36.9346794Z Generating XML reports... 
2022-11-23T02:07:36.9347238Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014723.xml 2022-11-23T02:07:36.9347607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9347783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9348157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9348341Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9348361Z 2022-11-23T02:07:36.9348467Z Running tests... 2022-11-23T02:07:36.9348736Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9349034Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9349422Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:36.9349442Z 2022-11-23T02:07:36.9349692Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9349794Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9349813Z 2022-11-23T02:07:36.9349912Z OK (skipped=1) 2022-11-23T02:07:36.9349935Z 2022-11-23T02:07:36.9350050Z Generating XML reports... 
2022-11-23T02:07:36.9350489Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014725.xml 2022-11-23T02:07:36.9350852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9351022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9351386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9351570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9351589Z 2022-11-23T02:07:36.9351688Z Running tests... 2022-11-23T02:07:36.9351940Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9352240Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9352579Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9352790Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23314 2022-11-23T02:07:36.9352998Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23315 2022-11-23T02:07:36.9353359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9353529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9353903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9354083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9354440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9354608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9354983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9355369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9355681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9355936Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9356350Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9356749Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9356973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9357196Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9357290Z ok (4.283s) 2022-11-23T02:07:36.9357310Z 2022-11-23T02:07:36.9357568Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9357673Z Ran 1 test in 4.283s 2022-11-23T02:07:36.9357693Z 2022-11-23T02:07:36.9357771Z OK 2022-11-23T02:07:36.9357798Z 2022-11-23T02:07:36.9357910Z Generating XML reports... 2022-11-23T02:07:36.9358355Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014728.xml 2022-11-23T02:07:36.9358726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9358896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9359272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9359461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9359480Z 2022-11-23T02:07:36.9359585Z Running tests... 2022-11-23T02:07:36.9359849Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9360145Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9360421Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9360635Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23417 2022-11-23T02:07:36.9360848Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23418 2022-11-23T02:07:36.9361222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9361392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9361767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9362032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9362388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9362563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9362938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9363122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9363357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9363592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9363983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9364376Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9364599Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9364876Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9365102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9365327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9365720Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9366101Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9366446Z STAGE:2022-11-23 01:47:38 23417:23417 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9366774Z STAGE:2022-11-23 01:47:38 23418:23418 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9367051Z [1669168058.834614] [d8f8c46cdf70:23418:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9367287Z [1669168059.866727] [d8f8c46cdf70:23418:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9367530Z [1669168059.866727] [d8f8c46cdf70:23418:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9367792Z [1669168058.832147] [d8f8c46cdf70:23417:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9368028Z [1669168059.887689] [d8f8c46cdf70:23417:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9368268Z [1669168059.887689] [d8f8c46cdf70:23417:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9368826Z STAGE:2022-11-23 01:47:40 23418:23418 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:47:40 23417:23417 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9368848Z 2022-11-23T02:07:36.9369414Z STAGE:2022-11-23 01:47:40 23418:23418 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:47:40 23417:23417 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9369434Z 2022-11-23T02:07:36.9369763Z STAGE:2022-11-23 01:47:40 23418:23418 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9370089Z STAGE:2022-11-23 01:47:40 23417:23417 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9370477Z STAGE:2022-11-23 01:47:40 23418:23418 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9371029Z STAGE:2022-11-23 01:47:40 23418:23418 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:47:40 23417:23417 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9371049Z 2022-11-23T02:07:36.9371397Z STAGE:2022-11-23 01:47:40 23417:23417 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9371502Z ok (5.874s) 2022-11-23T02:07:36.9371522Z 2022-11-23T02:07:36.9371772Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9371884Z Ran 1 test in 5.874s 2022-11-23T02:07:36.9371904Z 2022-11-23T02:07:36.9371990Z OK 2022-11-23T02:07:36.9372009Z 2022-11-23T02:07:36.9372131Z Generating XML reports... 
2022-11-23T02:07:36.9372575Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014734.xml 2022-11-23T02:07:36.9372951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9373127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9373552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9373752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9373772Z 2022-11-23T02:07:36.9373866Z Running tests... 2022-11-23T02:07:36.9374125Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9374430Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9374699Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9374919Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23531 2022-11-23T02:07:36.9375135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23532 2022-11-23T02:07:36.9375509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9375679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9376045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9376235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9376593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9376765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9377144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9377333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9377577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9377824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9378223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9378607Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9378837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9379078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9379292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9379584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9379982Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9380371Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9380705Z STAGE:2022-11-23 01:47:47 23532:23532 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9381066Z STAGE:2022-11-23 01:47:47 23531:23531 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9381332Z [1669168067.266201] [d8f8c46cdf70:23532:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9381561Z [1669168068.282636] [d8f8c46cdf70:23532:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9381806Z [1669168068.282636] [d8f8c46cdf70:23532:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9382086Z [1669168067.243988] [d8f8c46cdf70:23531:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9382362Z [1669168068.281138] [d8f8c46cdf70:23531:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9382606Z [1669168068.281138] [d8f8c46cdf70:23531:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9383157Z STAGE:2022-11-23 01:47:48 23532:23532 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:47:48 23531:23531 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9383180Z 2022-11-23T02:07:36.9383529Z STAGE:2022-11-23 01:47:48 23532:23532 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9383873Z STAGE:2022-11-23 01:47:48 23531:23531 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9384205Z STAGE:2022-11-23 01:47:48 23532:23532 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9384530Z STAGE:2022-11-23 01:47:48 23531:23531 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9384851Z STAGE:2022-11-23 01:47:48 23532:23532 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9385172Z STAGE:2022-11-23 01:47:48 23531:23531 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9385516Z STAGE:2022-11-23 01:47:48 23532:23532 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9385855Z STAGE:2022-11-23 01:47:48 23531:23531 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9385960Z ok (5.866s) 2022-11-23T02:07:36.9385980Z 2022-11-23T02:07:36.9386246Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9386359Z Ran 1 test in 5.866s 2022-11-23T02:07:36.9386378Z 2022-11-23T02:07:36.9386469Z OK 2022-11-23T02:07:36.9386488Z 2022-11-23T02:07:36.9386610Z Generating XML reports... 
2022-11-23T02:07:36.9387054Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014743.xml 2022-11-23T02:07:36.9387430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9387606Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9387984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9388175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9388292Z 2022-11-23T02:07:36.9388406Z Running tests... 2022-11-23T02:07:36.9388672Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9388985Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9389251Z test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9389466Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23645 2022-11-23T02:07:36.9389678Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23646 2022-11-23T02:07:36.9390050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9390219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9390601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9390791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9391151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9391325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9391736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9391927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9392169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9392412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9392819Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9393214Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9393441Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9393680Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9393893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9394122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9394518Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9394851Z STAGE:2022-11-23 01:47:55 23646:23646 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9395458Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9395802Z STAGE:2022-11-23 01:47:55 23645:23645 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9396080Z [1669168075.722557] [d8f8c46cdf70:23645:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9396319Z [1669168076.759961] [d8f8c46cdf70:23645:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9396557Z [1669168076.759961] [d8f8c46cdf70:23645:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9396828Z [1669168075.725153] [d8f8c46cdf70:23646:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9397046Z [1669168076.780583] [d8f8c46cdf70:23646:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9397375Z [1669168076.780583] [d8f8c46cdf70:23646:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9397930Z STAGE:2022-11-23 01:47:57 23645:23645 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:47:57 23646:23646 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9397956Z 2022-11-23T02:07:36.9398528Z STAGE:2022-11-23 01:47:57 23645:23645 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:47:57 23646:23646 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9398549Z 2022-11-23T02:07:36.9398874Z STAGE:2022-11-23 01:47:57 23645:23645 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9399192Z STAGE:2022-11-23 01:47:57 23646:23646 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9399527Z STAGE:2022-11-23 01:47:57 23645:23645 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9399879Z STAGE:2022-11-23 01:47:57 23645:23645 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9400208Z STAGE:2022-11-23 01:47:57 23646:23646 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9400622Z STAGE:2022-11-23 01:47:57 23646:23646 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9400730Z ok (5.965s) 2022-11-23T02:07:36.9400750Z 2022-11-23T02:07:36.9401003Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9401109Z Ran 1 test in 5.965s 2022-11-23T02:07:36.9401129Z 2022-11-23T02:07:36.9401212Z OK 2022-11-23T02:07:36.9401232Z 2022-11-23T02:07:36.9401348Z Generating XML reports... 
2022-11-23T02:07:36.9401797Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014751.xml 2022-11-23T02:07:36.9402178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9402348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9402728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9402923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9402942Z 2022-11-23T02:07:36.9403036Z Running tests... 2022-11-23T02:07:36.9403292Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9403596Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9403872Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9404090Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23759 2022-11-23T02:07:36.9404304Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23760 2022-11-23T02:07:36.9404675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9404843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9405211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9405399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9405768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9405935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9406314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9406505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9406807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9407050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9407459Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9407847Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9408071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9408312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9408529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9408761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9409169Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9409561Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9409941Z STAGE:2022-11-23 01:48:04 23759:23759 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9410277Z STAGE:2022-11-23 01:48:04 23760:23760 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9410546Z [1669168084.211223] [d8f8c46cdf70:23760:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9410775Z [1669168085.240918] [d8f8c46cdf70:23760:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9411022Z [1669168085.240918] [d8f8c46cdf70:23760:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9411302Z [1669168084.190819] [d8f8c46cdf70:23759:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9411531Z [1669168085.220963] [d8f8c46cdf70:23759:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9411768Z [1669168085.220963] [d8f8c46cdf70:23759:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9412333Z STAGE:2022-11-23 01:48:05 23760:23760 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:48:05 23759:23759 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9412355Z 2022-11-23T02:07:36.9412926Z STAGE:2022-11-23 01:48:05 23760:23760 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:48:05 23759:23759 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9412951Z 2022-11-23T02:07:36.9413277Z STAGE:2022-11-23 01:48:05 23760:23760 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9413596Z STAGE:2022-11-23 01:48:05 23759:23759 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9413924Z STAGE:2022-11-23 01:48:05 23760:23760 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9414237Z STAGE:2022-11-23 01:48:05 23759:23759 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9414581Z STAGE:2022-11-23 01:48:05 23760:23760 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9414927Z STAGE:2022-11-23 01:48:05 23759:23759 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9415025Z ok (5.900s) 2022-11-23T02:07:36.9415044Z 2022-11-23T02:07:36.9415368Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9415484Z Ran 1 test in 5.900s 2022-11-23T02:07:36.9415504Z 2022-11-23T02:07:36.9415591Z OK 2022-11-23T02:07:36.9415610Z 2022-11-23T02:07:36.9415733Z Generating XML reports... 
2022-11-23T02:07:36.9416189Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014800.xml 2022-11-23T02:07:36.9416550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9416730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9417106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9417294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9417314Z 2022-11-23T02:07:36.9417415Z Running tests... 2022-11-23T02:07:36.9417678Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9417989Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9418253Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9418507Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23873 2022-11-23T02:07:36.9418729Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23874 2022-11-23T02:07:36.9419103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9419277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9419647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9419834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9420209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9420381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9420749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9420928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9421173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9421415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9421810Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9422207Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9422445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9422675Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9422829Z skip: Skipped due to small world size. (4.275s) 2022-11-23T02:07:36.9422849Z 2022-11-23T02:07:36.9423114Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9423212Z Ran 1 test in 4.275s 2022-11-23T02:07:36.9423231Z 2022-11-23T02:07:36.9423337Z OK (skipped=1) 2022-11-23T02:07:36.9423357Z 2022-11-23T02:07:36.9423479Z Generating XML reports... 2022-11-23T02:07:36.9423924Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014808.xml 2022-11-23T02:07:36.9424295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9424525Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9424903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9425092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9425111Z 2022-11-23T02:07:36.9425204Z Running tests... 2022-11-23T02:07:36.9425476Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9425783Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9426038Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9426255Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23976 2022-11-23T02:07:36.9426460Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23977 2022-11-23T02:07:36.9426825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9427003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9427378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9427603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9427976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9428145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9428514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9428698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9428943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9429182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9429582Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9429969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9430199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9430431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9430583Z skip: Skipped due to small world size. (4.233s) 2022-11-23T02:07:36.9430603Z 2022-11-23T02:07:36.9430861Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9430969Z Ran 1 test in 4.233s 2022-11-23T02:07:36.9430989Z 2022-11-23T02:07:36.9431092Z OK (skipped=1) 2022-11-23T02:07:36.9431112Z 2022-11-23T02:07:36.9431231Z Generating XML reports... 2022-11-23T02:07:36.9431681Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014815.xml 2022-11-23T02:07:36.9432040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9432213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9432592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9432784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9432803Z 2022-11-23T02:07:36.9432907Z Running tests... 2022-11-23T02:07:36.9433166Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9433475Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9433796Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9434012Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24079 2022-11-23T02:07:36.9434214Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24080 2022-11-23T02:07:36.9434592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9434767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9435342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9435534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9435910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9436090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9436456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9436630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9436950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9437197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9437599Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9437998Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9438227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9438451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9438608Z skip: Skipped due to small world size. (4.274s) 2022-11-23T02:07:36.9438628Z 2022-11-23T02:07:36.9438892Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9438989Z Ran 1 test in 4.274s 2022-11-23T02:07:36.9439009Z 2022-11-23T02:07:36.9439108Z OK (skipped=1) 2022-11-23T02:07:36.9439131Z 2022-11-23T02:07:36.9439250Z Generating XML reports... 2022-11-23T02:07:36.9439694Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014822.xml 2022-11-23T02:07:36.9440057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9440231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9440614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9440814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9440834Z 2022-11-23T02:07:36.9440939Z Running tests... 2022-11-23T02:07:36.9441188Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9441494Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9441761Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9441975Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24182 2022-11-23T02:07:36.9442186Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24183 2022-11-23T02:07:36.9442558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9442729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9443186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9443363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9443731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9443902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9444278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9444467Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9444716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9444954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9445358Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9445758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9445974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9446241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9446402Z skip: Skipped due to small world size. (4.243s) 2022-11-23T02:07:36.9446423Z 2022-11-23T02:07:36.9446687Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9446792Z Ran 1 test in 4.244s 2022-11-23T02:07:36.9446811Z 2022-11-23T02:07:36.9446912Z OK (skipped=1) 2022-11-23T02:07:36.9446931Z 2022-11-23T02:07:36.9447050Z Generating XML reports... 2022-11-23T02:07:36.9447495Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014829.xml 2022-11-23T02:07:36.9447876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9448037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9448415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9448605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9448624Z 2022-11-23T02:07:36.9448733Z Running tests... 2022-11-23T02:07:36.9448996Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9449301Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9449552Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9449771Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24285 2022-11-23T02:07:36.9449978Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24286 2022-11-23T02:07:36.9450344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9450518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9450906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9451097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9451455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9451627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9452000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9452235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9452476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9452718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9453124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9453526Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9453750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9454084Z STAGE:2022-11-23 01:48:39 24286:24286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9454310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9454647Z STAGE:2022-11-23 01:48:39 24285:24285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9454928Z [1669168119.944219] [d8f8c46cdf70:24286:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9455199Z [1669168120.968001] [d8f8c46cdf70:24286:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9455445Z [1669168120.968001] [d8f8c46cdf70:24286:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9455720Z [1669168119.923008] [d8f8c46cdf70:24285:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9455950Z [1669168120.976084] [d8f8c46cdf70:24285:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9456193Z [1669168120.976084] [d8f8c46cdf70:24285:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9456745Z STAGE:2022-11-23 01:48:41 24286:24286 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:48:41 24285:24285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9456766Z 2022-11-23T02:07:36.9457118Z STAGE:2022-11-23 01:48:41 24286:24286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9457460Z STAGE:2022-11-23 01:48:41 24285:24285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9457780Z STAGE:2022-11-23 01:48:41 24286:24286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9458096Z STAGE:2022-11-23 01:48:41 24285:24285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9458417Z STAGE:2022-11-23 01:48:41 24286:24286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9458976Z STAGE:2022-11-23 01:48:41 24286:24286 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:48:41 24285:24285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9458996Z 2022-11-23T02:07:36.9459337Z STAGE:2022-11-23 01:48:41 24285:24285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9459435Z ok (5.862s) 2022-11-23T02:07:36.9459455Z 2022-11-23T02:07:36.9459716Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9459823Z Ran 1 test in 5.862s 2022-11-23T02:07:36.9459842Z 2022-11-23T02:07:36.9459933Z OK 2022-11-23T02:07:36.9459952Z 2022-11-23T02:07:36.9460069Z Generating XML reports... 
2022-11-23T02:07:36.9460511Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014836.xml 2022-11-23T02:07:36.9460878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9461098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9461479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9461670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9461690Z 2022-11-23T02:07:36.9461796Z Running tests... 2022-11-23T02:07:36.9462058Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9462363Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9462615Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9462832Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24399 2022-11-23T02:07:36.9463033Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24400 2022-11-23T02:07:36.9463406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9463578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9464023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9464217Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9464589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9464764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9465141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9465314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9465559Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9465800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9466194Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9466595Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9466822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9467150Z STAGE:2022-11-23 01:48:48 24400:24400 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9467373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9467700Z STAGE:2022-11-23 01:48:48 24399:24399 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9467987Z [1669168128.301052] [d8f8c46cdf70:24400:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9468210Z [1669168129.322392] [d8f8c46cdf70:24400:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9468449Z [1669168129.322392] [d8f8c46cdf70:24400:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9468717Z [1669168128.279646] [d8f8c46cdf70:24399:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9468943Z [1669168129.329915] [d8f8c46cdf70:24399:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9469176Z [1669168129.329915] [d8f8c46cdf70:24399:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9469783Z STAGE:2022-11-23 01:48:49 24400:24400 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:48:49 24399:24399 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9469804Z 2022-11-23T02:07:36.9470158Z STAGE:2022-11-23 01:48:49 24400:24400 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9470506Z STAGE:2022-11-23 01:48:49 24399:24399 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9470828Z STAGE:2022-11-23 01:48:49 24399:24399 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9471153Z STAGE:2022-11-23 01:48:49 24400:24400 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9471473Z STAGE:2022-11-23 01:48:49 24399:24399 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9471799Z STAGE:2022-11-23 01:48:49 24400:24400 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9472199Z STAGE:2022-11-23 01:48:49 24399:24399 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9472547Z STAGE:2022-11-23 01:48:49 24400:24400 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9472645Z ok (5.810s) 2022-11-23T02:07:36.9472665Z 2022-11-23T02:07:36.9472979Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9473098Z Ran 1 test in 5.810s 2022-11-23T02:07:36.9473118Z 2022-11-23T02:07:36.9473203Z OK 2022-11-23T02:07:36.9473222Z 2022-11-23T02:07:36.9473343Z Generating XML reports... 
2022-11-23T02:07:36.9473784Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014844.xml 2022-11-23T02:07:36.9474155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9474338Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9474719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9474908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9474928Z 2022-11-23T02:07:36.9475221Z Running tests... 2022-11-23T02:07:36.9475500Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9475816Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9476066Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9476280Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24513 2022-11-23T02:07:36.9476489Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24514 2022-11-23T02:07:36.9476865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9477037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9477415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9477608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9477967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9478136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9478498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9478689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9478927Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9479261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9479667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9480064Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9480294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9480523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9480900Z STAGE:2022-11-23 01:48:57 24514:24514 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9481663Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:07:36.9481777Z warnings.warn( 2022-11-23T02:07:36.9482110Z STAGE:2022-11-23 01:48:57 24513:24513 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9482938Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:07:36.9483054Z warnings.warn( 2022-11-23T02:07:36.9483332Z [1669168137.615397] [d8f8c46cdf70:24513:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9483568Z [1669168137.624796] [d8f8c46cdf70:24513:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9483808Z [1669168137.624796] [d8f8c46cdf70:24513:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9484155Z STAGE:2022-11-23 01:48:58 24513:24513 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9484433Z [1669168137.621904] [d8f8c46cdf70:24514:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. 
Fall back to non cooperative launch. 2022-11-23T02:07:36.9484651Z [1669168137.631056] [d8f8c46cdf70:24514:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9484889Z [1669168137.631056] [d8f8c46cdf70:24514:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9485223Z STAGE:2022-11-23 01:48:58 24514:24514 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9485568Z STAGE:2022-11-23 01:48:58 24513:24513 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9485919Z STAGE:2022-11-23 01:48:58 24514:24514 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9486244Z STAGE:2022-11-23 01:48:58 24514:24514 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9486566Z STAGE:2022-11-23 01:48:58 24513:24513 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9486897Z STAGE:2022-11-23 01:48:58 24514:24514 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9487241Z STAGE:2022-11-23 01:48:58 24514:24514 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9487560Z STAGE:2022-11-23 01:48:58 24513:24513 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9487897Z STAGE:2022-11-23 01:48:58 24513:24513 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9487997Z ok (5.931s) 2022-11-23T02:07:36.9488070Z 2022-11-23T02:07:36.9488337Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9488447Z Ran 1 test in 5.931s 2022-11-23T02:07:36.9488467Z 2022-11-23T02:07:36.9488557Z OK 2022-11-23T02:07:36.9488576Z 2022-11-23T02:07:36.9488691Z Generating XML reports... 
2022-11-23T02:07:36.9489144Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014852.xml 2022-11-23T02:07:36.9489513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9489675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9490058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9490254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9490273Z 2022-11-23T02:07:36.9490379Z Running tests... 2022-11-23T02:07:36.9490635Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9490941Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9491217Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9491485Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24631 2022-11-23T02:07:36.9491698Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24632 2022-11-23T02:07:36.9492068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9492240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9492615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9492809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9493173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9493344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9493721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9493906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9494140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9494377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9494777Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9495172Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9495401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9495624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9495961Z STAGE:2022-11-23 01:49:06 24632:24632 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9496727Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:07:36.9496838Z warnings.warn( 2022-11-23T02:07:36.9497159Z STAGE:2022-11-23 01:49:06 24631:24631 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9497925Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:07:36.9498091Z warnings.warn( 2022-11-23T02:07:36.9498378Z [1669168146.127465] [d8f8c46cdf70:24632:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9498612Z [1669168146.136585] [d8f8c46cdf70:24632:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9498854Z [1669168146.136585] [d8f8c46cdf70:24632:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9499199Z STAGE:2022-11-23 01:49:06 24632:24632 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9499475Z [1669168146.122952] [d8f8c46cdf70:24631:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. 
Fall back to non cooperative launch. 2022-11-23T02:07:36.9499709Z [1669168146.132560] [d8f8c46cdf70:24631:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9500003Z [1669168146.132560] [d8f8c46cdf70:24631:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9500352Z STAGE:2022-11-23 01:49:06 24631:24631 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9500686Z STAGE:2022-11-23 01:49:06 24632:24632 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9501038Z STAGE:2022-11-23 01:49:06 24631:24631 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9501368Z STAGE:2022-11-23 01:49:06 24631:24631 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9501694Z STAGE:2022-11-23 01:49:06 24632:24632 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9502036Z STAGE:2022-11-23 01:49:06 24631:24631 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9502383Z STAGE:2022-11-23 01:49:06 24631:24631 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9502721Z STAGE:2022-11-23 01:49:06 24632:24632 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9503063Z STAGE:2022-11-23 01:49:06 24632:24632 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9503162Z ok (6.054s) 2022-11-23T02:07:36.9503182Z 2022-11-23T02:07:36.9503433Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9503548Z Ran 1 test in 6.054s 2022-11-23T02:07:36.9503568Z 2022-11-23T02:07:36.9503662Z OK 2022-11-23T02:07:36.9503681Z 2022-11-23T02:07:36.9503802Z Generating XML reports... 
2022-11-23T02:07:36.9504254Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014901.xml 2022-11-23T02:07:36.9504636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9504810Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9505192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9505370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9505400Z 2022-11-23T02:07:36.9505493Z Running tests... 2022-11-23T02:07:36.9505754Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9506067Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9506326Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9506601Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24749 2022-11-23T02:07:36.9506820Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24750 2022-11-23T02:07:36.9507195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9507374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9507746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9507938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9508305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9508478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9508848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9509039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9509280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9509525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9509961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9510370Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9510592Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9510933Z STAGE:2022-11-23 01:49:13 24750:24750 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9511164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9511504Z STAGE:2022-11-23 01:49:13 24749:24749 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9511786Z [1669168153.764922] [d8f8c46cdf70:24750:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9512020Z [1669168154.797733] [d8f8c46cdf70:24750:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9512261Z [1669168154.797733] [d8f8c46cdf70:24750:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9512539Z [1669168153.742799] [d8f8c46cdf70:24749:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9512759Z [1669168154.802249] [d8f8c46cdf70:24749:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9513008Z [1669168154.802249] [d8f8c46cdf70:24749:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9513564Z STAGE:2022-11-23 01:49:15 24750:24750 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:49:15 24749:24749 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9513588Z 2022-11-23T02:07:36.9513942Z STAGE:2022-11-23 01:49:15 24749:24749 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9514288Z STAGE:2022-11-23 01:49:15 24750:24750 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9514616Z STAGE:2022-11-23 01:49:15 24750:24750 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9514946Z STAGE:2022-11-23 01:49:15 24749:24749 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9515567Z STAGE:2022-11-23 01:49:15 24750:24750 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9515900Z STAGE:2022-11-23 01:49:15 24749:24749 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9516243Z STAGE:2022-11-23 01:49:15 24750:24750 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9516579Z STAGE:2022-11-23 01:49:15 24749:24749 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9516680Z ok (5.864s) 2022-11-23T02:07:36.9516702Z 2022-11-23T02:07:36.9516970Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9517079Z Ran 1 test in 5.864s 2022-11-23T02:07:36.9517099Z 2022-11-23T02:07:36.9517189Z OK 2022-11-23T02:07:36.9517208Z 2022-11-23T02:07:36.9517332Z Generating XML reports... 
2022-11-23T02:07:36.9517784Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014909.xml 2022-11-23T02:07:36.9518161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9518322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9518707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9518973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9518995Z 2022-11-23T02:07:36.9519107Z Running tests... 2022-11-23T02:07:36.9519371Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9519676Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9519945Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9520166Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24863 2022-11-23T02:07:36.9520387Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24864 2022-11-23T02:07:36.9520750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9520923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9521312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9521503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9521867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9522034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9522413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9522612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9522844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9523081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9523486Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9523888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9524117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9524341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9524617Z [1669168162.974488] [d8f8c46cdf70:24864:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9524928Z [1669168162.979925] [d8f8c46cdf70:24864:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9525167Z [1669168162.979925] [d8f8c46cdf70:24864:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9525444Z [1669168162.972353] [d8f8c46cdf70:24863:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9525663Z [1669168162.978499] [d8f8c46cdf70:24863:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9525901Z [1669168162.978499] [d8f8c46cdf70:24863:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9526003Z ok (5.669s) 2022-11-23T02:07:36.9526024Z 2022-11-23T02:07:36.9526296Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9526408Z Ran 1 test in 5.669s 2022-11-23T02:07:36.9526428Z 2022-11-23T02:07:36.9526517Z OK 2022-11-23T02:07:36.9526536Z 2022-11-23T02:07:36.9526656Z Generating XML reports... 
2022-11-23T02:07:36.9527105Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014918.xml 2022-11-23T02:07:36.9527590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9527761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9528144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9528334Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9528353Z 2022-11-23T02:07:36.9528460Z Running tests... 2022-11-23T02:07:36.9528723Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9529040Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9529295Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9529510Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24977 2022-11-23T02:07:36.9529717Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24978 2022-11-23T02:07:36.9530092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9530264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9530646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9530837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9531203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9531380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9531752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9531943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9532177Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9532418Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9532825Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9533219Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9533506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9533848Z STAGE:2022-11-23 01:49:30 24978:24978 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9534078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9534412Z STAGE:2022-11-23 01:49:30 24977:24977 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9534691Z [1669168170.427303] [d8f8c46cdf70:24978:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9534912Z [1669168171.456768] [d8f8c46cdf70:24978:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9535156Z [1669168171.456768] [d8f8c46cdf70:24978:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9535433Z [1669168170.405942] [d8f8c46cdf70:24977:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9535671Z [1669168171.464825] [d8f8c46cdf70:24977:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9535953Z [1669168171.464825] [d8f8c46cdf70:24977:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9536518Z STAGE:2022-11-23 01:49:31 24978:24978 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:49:31 24977:24977 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9536539Z 2022-11-23T02:07:36.9536892Z STAGE:2022-11-23 01:49:31 24978:24978 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9537236Z STAGE:2022-11-23 01:49:31 24977:24977 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9537572Z STAGE:2022-11-23 01:49:31 24978:24978 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9537899Z STAGE:2022-11-23 01:49:31 24977:24977 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9538220Z STAGE:2022-11-23 01:49:31 24978:24978 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9538782Z STAGE:2022-11-23 01:49:31 24978:24978 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:49:31 24977:24977 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9538818Z 2022-11-23T02:07:36.9539152Z STAGE:2022-11-23 01:49:31 24977:24977 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9539255Z ok (6.044s) 2022-11-23T02:07:36.9539275Z 2022-11-23T02:07:36.9539542Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9539655Z Ran 1 test in 6.044s 2022-11-23T02:07:36.9539674Z 2022-11-23T02:07:36.9539773Z OK 2022-11-23T02:07:36.9539792Z 2022-11-23T02:07:36.9539911Z Generating XML reports... 
2022-11-23T02:07:36.9540360Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014926.xml 2022-11-23T02:07:36.9540732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9540899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9541286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9541480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9541499Z 2022-11-23T02:07:36.9541607Z Running tests... 2022-11-23T02:07:36.9541871Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9542182Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9542500Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9542718Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25091 2022-11-23T02:07:36.9542920Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25092 2022-11-23T02:07:36.9543299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9543476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9543857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9544046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9544415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9544595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9544964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9545155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9545435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9545689Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9546092Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9546485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9546717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9547058Z STAGE:2022-11-23 01:49:38 25091:25091 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9547281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9547607Z STAGE:2022-11-23 01:49:38 25092:25092 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9547889Z [1669168178.929885] [d8f8c46cdf70:25091:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9548112Z [1669168179.977624] [d8f8c46cdf70:25091:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9548352Z [1669168179.977624] [d8f8c46cdf70:25091:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9548626Z [1669168178.950774] [d8f8c46cdf70:25092:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9548860Z [1669168179.989710] [d8f8c46cdf70:25092:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9549106Z [1669168179.989710] [d8f8c46cdf70:25092:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9549666Z STAGE:2022-11-23 01:49:40 25091:25091 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:49:40 25092:25092 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9549688Z 2022-11-23T02:07:36.9550263Z STAGE:2022-11-23 01:49:40 25091:25091 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:49:40 25092:25092 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9550284Z 2022-11-23T02:07:36.9550616Z STAGE:2022-11-23 01:49:40 25091:25091 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9551004Z STAGE:2022-11-23 01:49:40 25092:25092 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9551342Z STAGE:2022-11-23 01:49:40 25091:25091 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9551900Z STAGE:2022-11-23 01:49:40 25092:25092 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:49:40 25091:25091 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9551921Z 2022-11-23T02:07:36.9552267Z STAGE:2022-11-23 01:49:40 25092:25092 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9552355Z ok (5.853s) 2022-11-23T02:07:36.9552374Z 2022-11-23T02:07:36.9552641Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9552749Z Ran 1 test in 5.853s 2022-11-23T02:07:36.9552769Z 2022-11-23T02:07:36.9552861Z OK 2022-11-23T02:07:36.9552880Z 2022-11-23T02:07:36.9553000Z Generating XML reports... 
2022-11-23T02:07:36.9553449Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014935.xml 2022-11-23T02:07:36.9553821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9553992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9554424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9554613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9554634Z 2022-11-23T02:07:36.9554741Z Running tests... 2022-11-23T02:07:36.9555003Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9555531Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9555796Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9556021Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25205 2022-11-23T02:07:36.9556237Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25206 2022-11-23T02:07:36.9556609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9556775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9557156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9557346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9557712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9557890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9558269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9558463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9558710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9558956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9559347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9559748Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9559982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9560320Z STAGE:2022-11-23 01:49:47 25205:25205 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9560629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9560966Z STAGE:2022-11-23 01:49:47 25206:25206 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9561247Z [1669168187.359289] [d8f8c46cdf70:25205:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9561482Z [1669168188.406596] [d8f8c46cdf70:25205:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9561730Z [1669168188.406596] [d8f8c46cdf70:25205:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9561991Z [1669168187.360985] [d8f8c46cdf70:25206:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9562222Z [1669168188.422978] [d8f8c46cdf70:25206:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9562464Z [1669168188.422978] [d8f8c46cdf70:25206:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9563079Z STAGE:2022-11-23 01:49:48 25205:25205 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:49:48 25206:25206 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9563104Z 2022-11-23T02:07:36.9563471Z STAGE:2022-11-23 01:49:48 25206:25206 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9563816Z STAGE:2022-11-23 01:49:48 25205:25205 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9564144Z STAGE:2022-11-23 01:49:48 25206:25206 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9564469Z STAGE:2022-11-23 01:49:48 25205:25205 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9564809Z STAGE:2022-11-23 01:49:48 25206:25206 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9565141Z STAGE:2022-11-23 01:49:48 25205:25205 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9565474Z STAGE:2022-11-23 01:49:48 25206:25206 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9565824Z STAGE:2022-11-23 01:49:48 25205:25205 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9565925Z ok (5.869s) 2022-11-23T02:07:36.9565945Z 2022-11-23T02:07:36.9566207Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9566321Z Ran 1 test in 5.870s 2022-11-23T02:07:36.9566340Z 2022-11-23T02:07:36.9566429Z OK 2022-11-23T02:07:36.9566448Z 2022-11-23T02:07:36.9566572Z Generating XML reports... 
2022-11-23T02:07:36.9567023Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014943.xml 2022-11-23T02:07:36.9567405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9567567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9567953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9568140Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9568159Z 2022-11-23T02:07:36.9568264Z Running tests... 2022-11-23T02:07:36.9568528Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9568840Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9569139Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T02:07:36.9569212Z 2022-11-23T02:07:36.9569478Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9569589Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9569609Z 2022-11-23T02:07:36.9569701Z OK (skipped=1) 2022-11-23T02:07:36.9569720Z 2022-11-23T02:07:36.9569842Z Generating XML reports... 
2022-11-23T02:07:36.9570298Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014951.xml 2022-11-23T02:07:36.9570674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9570845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9571228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9571422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9571442Z 2022-11-23T02:07:36.9571552Z Running tests... 2022-11-23T02:07:36.9571802Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9572114Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9572421Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T02:07:36.9572487Z 2022-11-23T02:07:36.9572755Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9572870Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9572889Z 2022-11-23T02:07:36.9572992Z OK (skipped=1) 2022-11-23T02:07:36.9573011Z 2022-11-23T02:07:36.9573131Z Generating XML reports... 
2022-11-23T02:07:36.9573575Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014954.xml 2022-11-23T02:07:36.9573943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9574110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9574487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9574678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9574697Z 2022-11-23T02:07:36.9574808Z Running tests... 2022-11-23T02:07:36.9575072Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9575379Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9575689Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T02:07:36.9575709Z 2022-11-23T02:07:36.9575968Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9576080Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9576103Z 2022-11-23T02:07:36.9576196Z OK (skipped=1) 2022-11-23T02:07:36.9576229Z 2022-11-23T02:07:36.9576336Z Generating XML reports... 
2022-11-23T02:07:36.9576779Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014956.xml 2022-11-23T02:07:36.9577150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9577325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9577701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9577887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9577906Z 2022-11-23T02:07:36.9578011Z Running tests... 2022-11-23T02:07:36.9578276Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9578570Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9578873Z test_all_to_all (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:07:36.9578893Z 2022-11-23T02:07:36.9579152Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9579262Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9579281Z 2022-11-23T02:07:36.9579393Z OK (skipped=1) 2022-11-23T02:07:36.9579412Z 2022-11-23T02:07:36.9579536Z Generating XML reports... 
2022-11-23T02:07:36.9579972Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014959.xml 2022-11-23T02:07:36.9580343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9580520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9580918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9581111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9581131Z 2022-11-23T02:07:36.9581234Z Running tests... 2022-11-23T02:07:36.9581490Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9581848Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9582117Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:07:36.9582137Z 2022-11-23T02:07:36.9582399Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9582509Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9582528Z 2022-11-23T02:07:36.9582620Z OK (skipped=1) 2022-11-23T02:07:36.9582653Z 2022-11-23T02:07:36.9582760Z Generating XML reports... 
2022-11-23T02:07:36.9583208Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015001.xml 2022-11-23T02:07:36.9583587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9583765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9584148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9584343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9584362Z 2022-11-23T02:07:36.9584475Z Running tests... 2022-11-23T02:07:36.9584735Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9585029Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9585287Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T02:07:36.9585310Z 2022-11-23T02:07:36.9585571Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9585684Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9585703Z 2022-11-23T02:07:36.9585805Z OK (skipped=1) 2022-11-23T02:07:36.9585824Z 2022-11-23T02:07:36.9585947Z Generating XML reports... 
2022-11-23T02:07:36.9586398Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015003.xml 2022-11-23T02:07:36.9586770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9586947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9587315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9587512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9587531Z 2022-11-23T02:07:36.9587638Z Running tests... 2022-11-23T02:07:36.9587961Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9588272Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9588542Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T02:07:36.9588562Z 2022-11-23T02:07:36.9588826Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9588940Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9588959Z 2022-11-23T02:07:36.9589051Z OK (skipped=1) 2022-11-23T02:07:36.9589085Z 2022-11-23T02:07:36.9589191Z Generating XML reports... 
2022-11-23T02:07:36.9589642Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015006.xml 2022-11-23T02:07:36.9590016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9590195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9590578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9590768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9590788Z 2022-11-23T02:07:36.9590889Z Running tests... 2022-11-23T02:07:36.9591210Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9591513Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9591768Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:07:36.9591787Z 2022-11-23T02:07:36.9592043Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9592157Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9592176Z 2022-11-23T02:07:36.9592283Z OK (skipped=1) 2022-11-23T02:07:36.9592306Z 2022-11-23T02:07:36.9592425Z Generating XML reports... 
2022-11-23T02:07:36.9592867Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015008.xml 2022-11-23T02:07:36.9593238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9593419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9593786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9593978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9593998Z 2022-11-23T02:07:36.9594102Z Running tests... 2022-11-23T02:07:36.9594363Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9594675Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9594955Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T02:07:36.9594974Z 2022-11-23T02:07:36.9595451Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9595566Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9595586Z 2022-11-23T02:07:36.9595695Z OK (skipped=1) 2022-11-23T02:07:36.9595719Z 2022-11-23T02:07:36.9595826Z Generating XML reports... 
2022-11-23T02:07:36.9596278Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015011.xml 2022-11-23T02:07:36.9596653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9596827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9597202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9597476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9597496Z 2022-11-23T02:07:36.9597604Z Running tests... 2022-11-23T02:07:36.9597872Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9598169Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9598429Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:07:36.9598449Z 2022-11-23T02:07:36.9598713Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9598818Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9598838Z 2022-11-23T02:07:36.9598942Z OK (skipped=1) 2022-11-23T02:07:36.9598960Z 2022-11-23T02:07:36.9599078Z Generating XML reports... 
2022-11-23T02:07:36.9599520Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015013.xml 2022-11-23T02:07:36.9599890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9600062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9600427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9600680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9600701Z 2022-11-23T02:07:36.9600812Z Running tests... 2022-11-23T02:07:36.9601069Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9601379Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9601657Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9601677Z 2022-11-23T02:07:36.9601935Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9602048Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9602068Z 2022-11-23T02:07:36.9602170Z OK (skipped=1) 2022-11-23T02:07:36.9602189Z 2022-11-23T02:07:36.9602296Z Generating XML reports... 
2022-11-23T02:07:36.9602740Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015015.xml 2022-11-23T02:07:36.9603119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9603296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9603675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9603866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9603886Z 2022-11-23T02:07:36.9603992Z Running tests... 2022-11-23T02:07:36.9604260Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9604554Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9604839Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9604859Z 2022-11-23T02:07:36.9605120Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9605233Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9605252Z 2022-11-23T02:07:36.9605356Z OK (skipped=1) 2022-11-23T02:07:36.9605375Z 2022-11-23T02:07:36.9605494Z Generating XML reports... 
2022-11-23T02:07:36.9605935Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015018.xml 2022-11-23T02:07:36.9606302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9606476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9606899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9607091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9607110Z 2022-11-23T02:07:36.9607219Z Running tests... 2022-11-23T02:07:36.9607481Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9607792Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9608087Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9608107Z 2022-11-23T02:07:36.9608368Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9608476Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9608495Z 2022-11-23T02:07:36.9608601Z OK (skipped=1) 2022-11-23T02:07:36.9608624Z 2022-11-23T02:07:36.9608730Z Generating XML reports... 
2022-11-23T02:07:36.9609170Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015020.xml 2022-11-23T02:07:36.9609541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9609761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9610150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9610342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9610361Z 2022-11-23T02:07:36.9610470Z Running tests... 2022-11-23T02:07:36.9610727Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9611037Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9611322Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9611342Z 2022-11-23T02:07:36.9611601Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9611710Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9611729Z 2022-11-23T02:07:36.9611834Z OK (skipped=1) 2022-11-23T02:07:36.9611856Z 2022-11-23T02:07:36.9611979Z Generating XML reports... 
2022-11-23T02:07:36.9612419Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015023.xml 2022-11-23T02:07:36.9612789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9612961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9613334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9613515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9613534Z 2022-11-23T02:07:36.9613643Z Running tests... 2022-11-23T02:07:36.9613902Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9614218Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9614528Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9614548Z 2022-11-23T02:07:36.9614813Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9614924Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9614944Z 2022-11-23T02:07:36.9615047Z OK (skipped=1) 2022-11-23T02:07:36.9615067Z 2022-11-23T02:07:36.9615185Z Generating XML reports... 
2022-11-23T02:07:36.9615608Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015025.xml 2022-11-23T02:07:36.9616040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9616212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9616594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9616781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9616801Z 2022-11-23T02:07:36.9616906Z Running tests... 2022-11-23T02:07:36.9617160Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9617465Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9617747Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9617783Z 2022-11-23T02:07:36.9618026Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9618136Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9618155Z 2022-11-23T02:07:36.9618260Z OK (skipped=1) 2022-11-23T02:07:36.9618280Z 2022-11-23T02:07:36.9618399Z Generating XML reports... 
2022-11-23T02:07:36.9618892Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015027.xml 2022-11-23T02:07:36.9619272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9619447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9619827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9620003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9620035Z 2022-11-23T02:07:36.9620133Z Running tests... 2022-11-23T02:07:36.9620396Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9620701Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9621000Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9621024Z 2022-11-23T02:07:36.9621282Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9621390Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9621409Z 2022-11-23T02:07:36.9621512Z OK (skipped=1) 2022-11-23T02:07:36.9621531Z 2022-11-23T02:07:36.9621644Z Generating XML reports... 
2022-11-23T02:07:36.9622070Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015030.xml 2022-11-23T02:07:36.9622439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9622623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9623004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9623190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9623209Z 2022-11-23T02:07:36.9623320Z Running tests... 2022-11-23T02:07:36.9623583Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9623893Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9624183Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9624204Z 2022-11-23T02:07:36.9624451Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9624561Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9624634Z 2022-11-23T02:07:36.9624740Z OK (skipped=1) 2022-11-23T02:07:36.9624758Z 2022-11-23T02:07:36.9624876Z Generating XML reports... 
2022-11-23T02:07:36.9625319Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015032.xml 2022-11-23T02:07:36.9625686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9625860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9626241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9626416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9626452Z 2022-11-23T02:07:36.9626545Z Running tests... 2022-11-23T02:07:36.9626805Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9627111Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9627414Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9627434Z 2022-11-23T02:07:36.9627690Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9627855Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9627877Z 2022-11-23T02:07:36.9627991Z OK (skipped=1) 2022-11-23T02:07:36.9628009Z 2022-11-23T02:07:36.9628133Z Generating XML reports... 
2022-11-23T02:07:36.9628563Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015035.xml 2022-11-23T02:07:36.9628935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9629107Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9629488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9629679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9629699Z 2022-11-23T02:07:36.9629806Z Running tests... 2022-11-23T02:07:36.9630067Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9630372Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9630656Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9630676Z 2022-11-23T02:07:36.9630919Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9631025Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9631044Z 2022-11-23T02:07:36.9631151Z OK (skipped=1) 2022-11-23T02:07:36.9631170Z 2022-11-23T02:07:36.9631290Z Generating XML reports... 
2022-11-23T02:07:36.9631735Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015037.xml 2022-11-23T02:07:36.9632102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9632273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9632653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9632845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9632864Z 2022-11-23T02:07:36.9632955Z Running tests... 2022-11-23T02:07:36.9633218Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9633526Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9633820Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9633890Z 2022-11-23T02:07:36.9634163Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9634275Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9634294Z 2022-11-23T02:07:36.9634399Z OK (skipped=1) 2022-11-23T02:07:36.9634418Z 2022-11-23T02:07:36.9634538Z Generating XML reports... 
2022-11-23T02:07:36.9634991Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015039.xml 2022-11-23T02:07:36.9635566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9635741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9636124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9636309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9636333Z 2022-11-23T02:07:36.9636437Z Running tests... 2022-11-23T02:07:36.9636698Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9637010Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9637374Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9637398Z 2022-11-23T02:07:36.9637664Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9637759Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9637778Z 2022-11-23T02:07:36.9637880Z OK (skipped=1) 2022-11-23T02:07:36.9637899Z 2022-11-23T02:07:36.9638017Z Generating XML reports... 
2022-11-23T02:07:36.9638465Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015042.xml 2022-11-23T02:07:36.9638842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9639021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9639399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9639598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9639618Z 2022-11-23T02:07:36.9639711Z Running tests... 2022-11-23T02:07:36.9639971Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9640276Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9640581Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9640601Z 2022-11-23T02:07:36.9640863Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9640980Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9641000Z 2022-11-23T02:07:36.9641105Z OK (skipped=1) 2022-11-23T02:07:36.9641124Z 2022-11-23T02:07:36.9641244Z Generating XML reports... 
2022-11-23T02:07:36.9641685Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015044.xml 2022-11-23T02:07:36.9642043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9642214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9642591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9642778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9642798Z 2022-11-23T02:07:36.9642903Z Running tests... 2022-11-23T02:07:36.9643160Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9643548Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9643849Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9643869Z 2022-11-23T02:07:36.9644132Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9644231Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9644250Z 2022-11-23T02:07:36.9644357Z OK (skipped=1) 2022-11-23T02:07:36.9644376Z 2022-11-23T02:07:36.9644500Z Generating XML reports... 
2022-11-23T02:07:36.9644945Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015046.xml 2022-11-23T02:07:36.9645312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9645487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9645871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9646066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9646086Z 2022-11-23T02:07:36.9646192Z Running tests... 2022-11-23T02:07:36.9646488Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9646807Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9647116Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9647136Z 2022-11-23T02:07:36.9647395Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9647506Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9647525Z 2022-11-23T02:07:36.9647629Z OK (skipped=1) 2022-11-23T02:07:36.9647652Z 2022-11-23T02:07:36.9647776Z Generating XML reports... 
2022-11-23T02:07:36.9648219Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015049.xml 2022-11-23T02:07:36.9648590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9648756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9649141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9649333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9649352Z 2022-11-23T02:07:36.9649455Z Running tests... 2022-11-23T02:07:36.9649714Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9650020Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9650315Z test_all_to_all_single_unequal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:07:36.9650335Z 2022-11-23T02:07:36.9650594Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9650691Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9650720Z 2022-11-23T02:07:36.9650812Z OK (skipped=1) 2022-11-23T02:07:36.9650835Z 2022-11-23T02:07:36.9650956Z Generating XML reports... 
2022-11-23T02:07:36.9651403Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015051.xml 2022-11-23T02:07:36.9651776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9651945Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9652324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9652574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9652594Z 2022-11-23T02:07:36.9652700Z Running tests... 2022-11-23T02:07:36.9652950Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9653259Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9653566Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:07:36.9653587Z 2022-11-23T02:07:36.9653846Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9653959Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9653978Z 2022-11-23T02:07:36.9654080Z OK (skipped=1) 2022-11-23T02:07:36.9654099Z 2022-11-23T02:07:36.9654219Z Generating XML reports... 
2022-11-23T02:07:36.9654663Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015054.xml 2022-11-23T02:07:36.9655038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9655199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9655632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9655828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9655848Z 2022-11-23T02:07:36.9655952Z Running tests... 2022-11-23T02:07:36.9656218Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9656522Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9656786Z test_average_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9657004Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26210 2022-11-23T02:07:36.9657214Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26211 2022-11-23T02:07:36.9657586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9657763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9658144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9658333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9658692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9658863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9659242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9659430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9659663Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9659904Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9660307Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9660704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9660930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9661153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9661431Z [1669168261.851719] [d8f8c46cdf70:26210:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9661723Z [1669168261.858990] [d8f8c46cdf70:26210:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9661964Z [1669168261.858990] [d8f8c46cdf70:26210:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9662241Z [1669168261.857772] [d8f8c46cdf70:26211:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9662461Z [1669168261.862841] [d8f8c46cdf70:26211:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9662700Z [1669168261.862841] [d8f8c46cdf70:26211:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9662944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9663186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9663592Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9664029Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:07:36.9664140Z ok (6.159s) 2022-11-23T02:07:36.9664159Z 2022-11-23T02:07:36.9664426Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9664530Z Ran 1 test in 6.159s 2022-11-23T02:07:36.9664550Z 2022-11-23T02:07:36.9664629Z OK 2022-11-23T02:07:36.9664648Z 2022-11-23T02:07:36.9664770Z Generating XML reports... 2022-11-23T02:07:36.9665211Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015056.xml 2022-11-23T02:07:36.9665590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9665765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9666147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9666341Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9666360Z 2022-11-23T02:07:36.9666466Z Running tests... 2022-11-23T02:07:36.9666713Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9667020Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9667278Z test_backend_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9667490Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26334 2022-11-23T02:07:36.9667709Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26335 2022-11-23T02:07:36.9668082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9668256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9668632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9668823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9669177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9669348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9669717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9669909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9670266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9670513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9670924Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9671328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9671558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9671773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9671922Z skip: Need at least 3 CUDA devices (4.257s) 2022-11-23T02:07:36.9671943Z 2022-11-23T02:07:36.9672205Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9672319Z Ran 1 test in 4.258s 2022-11-23T02:07:36.9672338Z 2022-11-23T02:07:36.9672446Z OK (skipped=1) 2022-11-23T02:07:36.9672465Z 2022-11-23T02:07:36.9672589Z Generating XML reports... 2022-11-23T02:07:36.9673036Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015105.xml 2022-11-23T02:07:36.9673451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9673622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9674004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9674192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9674211Z 2022-11-23T02:07:36.9674319Z Running tests... 2022-11-23T02:07:36.9674578Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9674885Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9675430Z test_backend_group (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 3 (0.002s) 2022-11-23T02:07:36.9675452Z 2022-11-23T02:07:36.9675722Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9675840Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9675860Z 2022-11-23T02:07:36.9675952Z OK (skipped=1) 2022-11-23T02:07:36.9675971Z 2022-11-23T02:07:36.9676087Z Generating XML reports... 2022-11-23T02:07:36.9676528Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015112.xml 2022-11-23T02:07:36.9676900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9677076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9677462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9677651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9677671Z 2022-11-23T02:07:36.9677777Z Running tests... 2022-11-23T02:07:36.9678024Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9678337Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9678588Z test_barrier (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T02:07:36.9678607Z 2022-11-23T02:07:36.9678865Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9678973Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9678992Z 2022-11-23T02:07:36.9679096Z OK (skipped=1) 2022-11-23T02:07:36.9679115Z 2022-11-23T02:07:36.9679234Z Generating XML reports... 
2022-11-23T02:07:36.9679766Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015114.xml 2022-11-23T02:07:36.9680140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9680301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9680684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9680881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9680931Z 2022-11-23T02:07:36.9681043Z Running tests... 2022-11-23T02:07:36.9681307Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9681617Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9681865Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9682089Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26503 2022-11-23T02:07:36.9682306Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26504 2022-11-23T02:07:36.9682664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9682899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9683296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9683488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9683852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9684024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9684398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9684589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9684821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9685065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9685474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9685872Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9686102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9686328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9686610Z [1669168281.667192] [d8f8c46cdf70:26504:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9686848Z [1669168281.673470] [d8f8c46cdf70:26504:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9687095Z [1669168281.673470] [d8f8c46cdf70:26504:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9687372Z [1669168281.663886] [d8f8c46cdf70:26503:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9687591Z [1669168281.671406] [d8f8c46cdf70:26503:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9687828Z [1669168281.671406] [d8f8c46cdf70:26503:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9687929Z ok (6.361s) 2022-11-23T02:07:36.9688001Z 2022-11-23T02:07:36.9688276Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9688387Z Ran 1 test in 6.361s 2022-11-23T02:07:36.9688406Z 2022-11-23T02:07:36.9688498Z OK 2022-11-23T02:07:36.9688518Z 2022-11-23T02:07:36.9688639Z Generating XML reports... 
2022-11-23T02:07:36.9689095Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015116.xml 2022-11-23T02:07:36.9689466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9689629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9690009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9690201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9690220Z 2022-11-23T02:07:36.9690329Z Running tests... 2022-11-23T02:07:36.9690588Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9690898Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9691160Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T02:07:36.9691180Z 2022-11-23T02:07:36.9691484Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9691587Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9691621Z 2022-11-23T02:07:36.9691712Z OK (skipped=1) 2022-11-23T02:07:36.9691731Z 2022-11-23T02:07:36.9691852Z Generating XML reports... 
2022-11-23T02:07:36.9692300Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015125.xml 2022-11-23T02:07:36.9692673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9692852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9693230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9693421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9693441Z 2022-11-23T02:07:36.9693549Z Running tests... 2022-11-23T02:07:36.9693802Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9694109Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9694369Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9694588Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26650 2022-11-23T02:07:36.9694800Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26651 2022-11-23T02:07:36.9695178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9695354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9695731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9695912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9696278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9696453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9696834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9697020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9697270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9697571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9697972Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9698374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9698592Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9698819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9698973Z skip: Skipped due to small world size. (4.244s) 2022-11-23T02:07:36.9698992Z 2022-11-23T02:07:36.9699259Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9699373Z Ran 1 test in 4.244s 2022-11-23T02:07:36.9699392Z 2022-11-23T02:07:36.9699500Z OK (skipped=1) 2022-11-23T02:07:36.9699519Z 2022-11-23T02:07:36.9699634Z Generating XML reports... 2022-11-23T02:07:36.9700084Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015128.xml 2022-11-23T02:07:36.9700454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9700662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9701049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9701234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9701254Z 2022-11-23T02:07:36.9701361Z Running tests... 2022-11-23T02:07:36.9701622Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9701931Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9702194Z test_barrier_group (__main__.TestDistBackendWithSpawn) ... 
skip: ucc does not support CPU barrier (0.002s) 2022-11-23T02:07:36.9702214Z 2022-11-23T02:07:36.9702475Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9702580Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9702600Z 2022-11-23T02:07:36.9702692Z OK (skipped=1) 2022-11-23T02:07:36.9702715Z 2022-11-23T02:07:36.9702833Z Generating XML reports... 2022-11-23T02:07:36.9703278Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015135.xml 2022-11-23T02:07:36.9703650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9703824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9704199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9704393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9704412Z 2022-11-23T02:07:36.9704515Z Running tests... 2022-11-23T02:07:36.9704764Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9705070Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9705331Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9705547Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26786 2022-11-23T02:07:36.9705762Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26787 2022-11-23T02:07:36.9706132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9706308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9706742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9706933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9707284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9707460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9707832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9708018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9708259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9708500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9708910Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9709303Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9709533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9709794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9709959Z skip: Skipped due to small world size. (4.241s) 2022-11-23T02:07:36.9709979Z 2022-11-23T02:07:36.9710246Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9710350Z Ran 1 test in 4.241s 2022-11-23T02:07:36.9710370Z 2022-11-23T02:07:36.9710472Z OK (skipped=1) 2022-11-23T02:07:36.9710491Z 2022-11-23T02:07:36.9710610Z Generating XML reports... 2022-11-23T02:07:36.9711061Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015137.xml 2022-11-23T02:07:36.9711438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9711602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9711988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9712183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9712203Z 2022-11-23T02:07:36.9712309Z Running tests... 2022-11-23T02:07:36.9712571Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9712882Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9713162Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) ... 
skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T02:07:36.9713186Z 2022-11-23T02:07:36.9713444Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9713551Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9713570Z 2022-11-23T02:07:36.9713663Z OK (skipped=1) 2022-11-23T02:07:36.9713682Z 2022-11-23T02:07:36.9713807Z Generating XML reports... 2022-11-23T02:07:36.9714260Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015144.xml 2022-11-23T02:07:36.9714632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9714804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9715398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9715592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9715613Z 2022-11-23T02:07:36.9715800Z Running tests... 2022-11-23T02:07:36.9716056Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9716371Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9716647Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T02:07:36.9716667Z 2022-11-23T02:07:36.9716929Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9717035Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9717054Z 2022-11-23T02:07:36.9717161Z OK (skipped=1) 2022-11-23T02:07:36.9717180Z 2022-11-23T02:07:36.9717298Z Generating XML reports... 
2022-11-23T02:07:36.9717741Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015146.xml 2022-11-23T02:07:36.9718112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9718277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9718663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9718853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9718873Z 2022-11-23T02:07:36.9718977Z Running tests... 2022-11-23T02:07:36.9719315Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9719640Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9719916Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T02:07:36.9719936Z 2022-11-23T02:07:36.9720192Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9720297Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9720316Z 2022-11-23T02:07:36.9720412Z OK (skipped=1) 2022-11-23T02:07:36.9720431Z 2022-11-23T02:07:36.9720550Z Generating XML reports... 
2022-11-23T02:07:36.9720998Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015149.xml 2022-11-23T02:07:36.9721377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9721555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9721933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9722126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9722146Z 2022-11-23T02:07:36.9722254Z Running tests... 2022-11-23T02:07:36.9722512Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9722808Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9723066Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T02:07:36.9723086Z 2022-11-23T02:07:36.9723341Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9723452Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9723472Z 2022-11-23T02:07:36.9723578Z OK (skipped=1) 2022-11-23T02:07:36.9723598Z 2022-11-23T02:07:36.9723720Z Generating XML reports... 
2022-11-23T02:07:36.9724162Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015151.xml 2022-11-23T02:07:36.9724527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9724700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9725068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9725323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9725343Z 2022-11-23T02:07:36.9725448Z Running tests... 2022-11-23T02:07:36.9725709Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9726017Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9726285Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T02:07:36.9726305Z 2022-11-23T02:07:36.9726562Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9726672Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9726691Z 2022-11-23T02:07:36.9726782Z OK (skipped=1) 2022-11-23T02:07:36.9726817Z 2022-11-23T02:07:36.9726925Z Generating XML reports... 
2022-11-23T02:07:36.9727366Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015153.xml 2022-11-23T02:07:36.9727740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9727915Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9728285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9728525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9728546Z 2022-11-23T02:07:36.9728660Z Running tests... 2022-11-23T02:07:36.9728925Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9729218Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9729490Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:07:36.9729510Z 2022-11-23T02:07:36.9729761Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9729871Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9729890Z 2022-11-23T02:07:36.9729995Z OK (skipped=1) 2022-11-23T02:07:36.9730014Z 2022-11-23T02:07:36.9730133Z Generating XML reports... 
2022-11-23T02:07:36.9730575Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015156.xml 2022-11-23T02:07:36.9730949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9731123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9731486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9731677Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9731697Z 2022-11-23T02:07:36.9731803Z Running tests... 2022-11-23T02:07:36.9732065Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9732371Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9732631Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T02:07:36.9732650Z 2022-11-23T02:07:36.9732912Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9733022Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9733041Z 2022-11-23T02:07:36.9733132Z OK (skipped=1) 2022-11-23T02:07:36.9733167Z 2022-11-23T02:07:36.9733273Z Generating XML reports... 
2022-11-23T02:07:36.9733719Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015158.xml 2022-11-23T02:07:36.9734088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9734264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9734702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9734890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9734910Z 2022-11-23T02:07:36.9735014Z Running tests... 2022-11-23T02:07:36.9735277Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9735572Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9735846Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T02:07:36.9735866Z 2022-11-23T02:07:36.9736124Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9736233Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9736252Z 2022-11-23T02:07:36.9736357Z OK (skipped=1) 2022-11-23T02:07:36.9736376Z 2022-11-23T02:07:36.9736499Z Generating XML reports... 
2022-11-23T02:07:36.9736940Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015201.xml 2022-11-23T02:07:36.9737308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9737529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9737906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9738099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9738119Z 2022-11-23T02:07:36.9738225Z Running tests... 2022-11-23T02:07:36.9738480Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9738785Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9739038Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:07:36.9739061Z 2022-11-23T02:07:36.9739315Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9739421Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9739440Z 2022-11-23T02:07:36.9739545Z OK (skipped=1) 2022-11-23T02:07:36.9739565Z 2022-11-23T02:07:36.9739676Z Generating XML reports... 
2022-11-23T02:07:36.9740121Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015203.xml 2022-11-23T02:07:36.9740493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9740670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9741049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9741243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9741262Z 2022-11-23T02:07:36.9741365Z Running tests... 2022-11-23T02:07:36.9741622Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9741915Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9742181Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:07:36.9742201Z 2022-11-23T02:07:36.9742461Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9742572Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9742592Z 2022-11-23T02:07:36.9742698Z OK (skipped=1) 2022-11-23T02:07:36.9742717Z 2022-11-23T02:07:36.9742838Z Generating XML reports... 
2022-11-23T02:07:36.9743278Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015205.xml 2022-11-23T02:07:36.9743704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9743881Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9744244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9744437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9744458Z 2022-11-23T02:07:36.9744566Z Running tests... 2022-11-23T02:07:36.9744822Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9745128Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9745407Z test_batch_isend_irecv_ring_exchange_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:07:36.9745427Z 2022-11-23T02:07:36.9745681Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9745793Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9745813Z 2022-11-23T02:07:36.9745918Z OK (skipped=1) 2022-11-23T02:07:36.9745937Z 2022-11-23T02:07:36.9746043Z Generating XML reports... 
2022-11-23T02:07:36.9746483Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015208.xml 2022-11-23T02:07:36.9746906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9747091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9747471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9747661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9747681Z 2022-11-23T02:07:36.9747785Z Running tests... 2022-11-23T02:07:36.9748040Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9748340Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9748603Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:07:36.9748622Z 2022-11-23T02:07:36.9748878Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9748989Z Ran 1 test in 0.003s 2022-11-23T02:07:36.9749008Z 2022-11-23T02:07:36.9749112Z OK (skipped=1) 2022-11-23T02:07:36.9749131Z 2022-11-23T02:07:36.9749248Z Generating XML reports... 
2022-11-23T02:07:36.9749688Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015210.xml 2022-11-23T02:07:36.9750053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9750225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9750595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9750786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9750805Z 2022-11-23T02:07:36.9750910Z Running tests... 2022-11-23T02:07:36.9751172Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9751480Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9751746Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:07:36.9751765Z 2022-11-23T02:07:36.9752023Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9752130Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9752149Z 2022-11-23T02:07:36.9752252Z OK (skipped=1) 2022-11-23T02:07:36.9752271Z 2022-11-23T02:07:36.9752378Z Generating XML reports... 
2022-11-23T02:07:36.9752876Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015213.xml 2022-11-23T02:07:36.9753246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9753420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9753800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9753990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9754010Z 2022-11-23T02:07:36.9754115Z Running tests... 2022-11-23T02:07:36.9754373Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9754664Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9754908Z test_broadcast (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9755338Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27318 2022-11-23T02:07:36.9755562Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27319 2022-11-23T02:07:36.9755929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9756174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9756565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9756756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9757125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9757283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9757649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9757836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9758082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9758327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9758731Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9759122Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9759351Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9759690Z STAGE:2022-11-23 01:52:19 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9759907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9760232Z STAGE:2022-11-23 01:52:19 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9760516Z [1669168339.323386] [d8f8c46cdf70:27318:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9760755Z [1669168340.362073] [d8f8c46cdf70:27318:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9760999Z [1669168340.362073] [d8f8c46cdf70:27318:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9761271Z [1669168339.344619] [d8f8c46cdf70:27319:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9761503Z [1669168340.382106] [d8f8c46cdf70:27319:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9761814Z [1669168340.382106] [d8f8c46cdf70:27319:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9762376Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9762398Z 2022-11-23T02:07:36.9762746Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9763093Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9763406Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9763734Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9764072Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9764416Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9764747Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9765141Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9765479Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9765799Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9766116Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: 
Collection 2022-11-23T02:07:36.9766446Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9766799Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9767142Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9767477Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9767801Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9768136Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9768462Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9768808Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9769133Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9769467Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9769788Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9770125Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9770457Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9770798Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9771146Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 
2022-11-23T02:07:36.9771469Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9771851Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9772217Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9772542Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9772884Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9773231Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9773555Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9773881Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9774212Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9774539Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9774881Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9775209Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9775581Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9775912Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9776243Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9776567Z 
STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9776912Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9777262Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9777581Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9777899Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9778222Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9778554Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9778901Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9779242Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9779568Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9779893Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9780224Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9780560Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9780902Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9781268Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9781593Z STAGE:2022-11-23 
01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9781911Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9782241Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9782632Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9782973Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9783320Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9783643Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9783966Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9784283Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9784614Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9784965Z STAGE:2022-11-23 01:52:20 27318:27318 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9785310Z STAGE:2022-11-23 01:52:20 27319:27319 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9785409Z ok (5.869s) 2022-11-23T02:07:36.9785429Z 2022-11-23T02:07:36.9785696Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9785869Z Ran 1 test in 5.869s 2022-11-23T02:07:36.9785890Z 2022-11-23T02:07:36.9785983Z OK 2022-11-23T02:07:36.9786002Z 2022-11-23T02:07:36.9786126Z Generating XML reports... 
2022-11-23T02:07:36.9786566Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015215.xml 2022-11-23T02:07:36.9786939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9787108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9787496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9787691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9787710Z 2022-11-23T02:07:36.9787811Z Running tests... 2022-11-23T02:07:36.9788069Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9788386Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9788656Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and Nccl backend supports CUDA allReduce (0.002s) 2022-11-23T02:07:36.9788692Z 2022-11-23T02:07:36.9788935Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9789046Z Ran 1 test in 0.002s 2022-11-23T02:07:36.9789065Z 2022-11-23T02:07:36.9789172Z OK (skipped=1) 2022-11-23T02:07:36.9789191Z 2022-11-23T02:07:36.9789310Z Generating XML reports... 
2022-11-23T02:07:36.9789764Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015223.xml 2022-11-23T02:07:36.9790138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9790317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9790703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9790882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9790910Z 2022-11-23T02:07:36.9791003Z Running tests... 2022-11-23T02:07:36.9791268Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9791576Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9791841Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9792121Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27465 2022-11-23T02:07:36.9792337Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27466 2022-11-23T02:07:36.9792710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9792888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9793250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9793442Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9793809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9793981Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9794360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9794551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9794795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9795329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9795750Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9796151Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9796381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9796622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9796850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9797086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9797483Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9797882Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9798217Z STAGE:2022-11-23 01:52:30 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9798530Z STAGE:2022-11-23 01:52:30 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9798809Z [1669168350.208723] [d8f8c46cdf70:27466:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9799051Z [1669168351.234732] [d8f8c46cdf70:27466:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9799287Z [1669168351.234732] [d8f8c46cdf70:27466:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9799562Z [1669168350.187127] [d8f8c46cdf70:27465:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9799796Z [1669168351.252633] [d8f8c46cdf70:27465:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9800037Z [1669168351.252633] [d8f8c46cdf70:27465:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9800592Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9800677Z 2022-11-23T02:07:36.9801259Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9801280Z 2022-11-23T02:07:36.9801617Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9801934Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9802267Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9802579Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9802928Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9803278Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9803612Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9803935Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9804309Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: 
Collection 2022-11-23T02:07:36.9804648Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9804990Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9805329Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9805640Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9805969Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9806304Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9806639Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9806982Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9807323Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9807652Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9807973Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9808306Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9808853Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9808890Z 2022-11-23T02:07:36.9809220Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 
2022-11-23T02:07:36.9809545Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9809875Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9810206Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9810527Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9810867Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9811268Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9811596Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9811922Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9812243Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9812794Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9812815Z 2022-11-23T02:07:36.9813160Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9813476Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9813799Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9814131Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9814549Z 
STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9814904Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9815247Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9815559Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9815880Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9816219Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9816555Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9816900Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9817246Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9817570Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9817889Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9818221Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9818536Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9818888Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9819231Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9819553Z STAGE:2022-11-23 
01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9819881Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9820209Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9820756Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9820777Z 2022-11-23T02:07:36.9821118Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9821504Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9821826Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:36.9822142Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9822471Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:36.9822815Z STAGE:2022-11-23 01:52:31 27465:27465 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9823158Z STAGE:2022-11-23 01:52:31 27466:27466 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:36.9823260Z ok (5.951s) 2022-11-23T02:07:36.9823280Z 2022-11-23T02:07:36.9823545Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9823659Z Ran 1 test in 5.951s 2022-11-23T02:07:36.9823679Z 2022-11-23T02:07:36.9823770Z OK 2022-11-23T02:07:36.9823789Z 2022-11-23T02:07:36.9823909Z Generating XML reports... 
2022-11-23T02:07:36.9824344Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015226.xml 2022-11-23T02:07:36.9824768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9824948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9825330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9825521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9825541Z 2022-11-23T02:07:36.9825646Z Running tests... 2022-11-23T02:07:36.9825907Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9826220Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9826461Z test_broadcast_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9826683Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27579 2022-11-23T02:07:36.9826901Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27580 2022-11-23T02:07:36.9827272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9827448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9827829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9828020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9828383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9828557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9828922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9829106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9829354Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9829597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9829997Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9830391Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9830619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9830902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9831047Z skip: Skipped due to small world size. (4.257s) 2022-11-23T02:07:36.9831073Z 2022-11-23T02:07:36.9831325Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9831433Z Ran 1 test in 4.257s 2022-11-23T02:07:36.9831457Z 2022-11-23T02:07:36.9831560Z OK (skipped=1) 2022-11-23T02:07:36.9831579Z 2022-11-23T02:07:36.9831697Z Generating XML reports... 2022-11-23T02:07:36.9832144Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015234.xml 2022-11-23T02:07:36.9832516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9832692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9833070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9833252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9833289Z 2022-11-23T02:07:36.9833382Z Running tests... 2022-11-23T02:07:36.9833643Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9834001Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9834272Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9834488Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27682 2022-11-23T02:07:36.9834703Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27683 2022-11-23T02:07:36.9835350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9835539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9835915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9836101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9836471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9836647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9837022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9837212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9837459Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9837704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9838096Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9838499Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9838731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9838960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9839735Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:07:36.9839844Z warnings.warn( 2022-11-23T02:07:36.9840611Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:07:36.9840813Z warnings.warn( 2022-11-23T02:07:36.9841098Z [1669168366.154106] [d8f8c46cdf70:27683:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9841335Z [1669168366.160274] [d8f8c46cdf70:27683:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9841581Z [1669168366.160274] [d8f8c46cdf70:27683:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9841842Z [1669168366.150791] [d8f8c46cdf70:27682:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9842079Z [1669168366.158089] [d8f8c46cdf70:27682:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9842314Z [1669168366.158089] [d8f8c46cdf70:27682:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9842411Z ok (5.451s) 2022-11-23T02:07:36.9842431Z 2022-11-23T02:07:36.9842759Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9842880Z Ran 1 test in 5.451s 2022-11-23T02:07:36.9842900Z 2022-11-23T02:07:36.9842988Z OK 2022-11-23T02:07:36.9843007Z 2022-11-23T02:07:36.9843128Z Generating XML reports... 2022-11-23T02:07:36.9843576Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015241.xml 2022-11-23T02:07:36.9843937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9844112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9844494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9844687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9844706Z 2022-11-23T02:07:36.9844810Z Running tests... 2022-11-23T02:07:36.9845079Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9845392Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9845652Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9846396Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82847 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.598s) 2022-11-23T02:07:36.9846421Z 2022-11-23T02:07:36.9846671Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9846778Z Ran 1 test in 1.598s 2022-11-23T02:07:36.9846797Z 2022-11-23T02:07:36.9846896Z OK (skipped=1) 2022-11-23T02:07:36.9846916Z 2022-11-23T02:07:36.9847031Z Generating XML reports... 2022-11-23T02:07:36.9847475Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015249.xml 2022-11-23T02:07:36.9847844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9848020Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9848399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9848592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9848663Z 2022-11-23T02:07:36.9848762Z Running tests... 2022-11-23T02:07:36.9849023Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9849331Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9849646Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9850391Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85012 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.643s) 2022-11-23T02:07:36.9850411Z 2022-11-23T02:07:36.9850672Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9850781Z Ran 1 test in 1.643s 2022-11-23T02:07:36.9850800Z 2022-11-23T02:07:36.9850909Z OK (skipped=1) 2022-11-23T02:07:36.9850928Z 2022-11-23T02:07:36.9851048Z Generating XML reports... 2022-11-23T02:07:36.9851497Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015253.xml 2022-11-23T02:07:36.9851857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9852093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9852488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9852679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9852698Z 2022-11-23T02:07:36.9852804Z Running tests... 2022-11-23T02:07:36.9853060Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9853367Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9853691Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9854446Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85339 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.661s) 2022-11-23T02:07:36.9854467Z 2022-11-23T02:07:36.9854729Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9854825Z Ran 1 test in 1.661s 2022-11-23T02:07:36.9854844Z 2022-11-23T02:07:36.9854950Z OK (skipped=1) 2022-11-23T02:07:36.9854970Z 2022-11-23T02:07:36.9855090Z Generating XML reports... 2022-11-23T02:07:36.9855538Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015257.xml 2022-11-23T02:07:36.9855914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9856091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9856472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9856668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9856687Z 2022-11-23T02:07:36.9856780Z Running tests... 2022-11-23T02:07:36.9857040Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9857348Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9857612Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9857828Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27898 2022-11-23T02:07:36.9858098Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27899 2022-11-23T02:07:36.9858469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9858639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9859023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9859199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9859564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9859735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9860110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9860301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9860546Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9860783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9861228Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9861621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9861850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9862077Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9862333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxqfatzu5 2022-11-23T02:07:36.9862610Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxqfatzu5/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9862866Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxdhpxvkg 2022-11-23T02:07:36.9863136Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxdhpxvkg/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9863415Z [1669168386.736997] [d8f8c46cdf70:27899:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9863650Z [1669168386.742453] [d8f8c46cdf70:27899:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9863879Z [1669168386.742453] [d8f8c46cdf70:27899:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9864156Z [1669168386.727287] [d8f8c46cdf70:27898:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9864395Z [1669168386.734732] [d8f8c46cdf70:27898:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9864635Z [1669168386.734732] [d8f8c46cdf70:27898:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9864871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9865113Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.9865213Z ok (5.970s) 2022-11-23T02:07:36.9865233Z 2022-11-23T02:07:36.9865503Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9865610Z Ran 1 test in 5.970s 2022-11-23T02:07:36.9865630Z 2022-11-23T02:07:36.9865707Z OK 2022-11-23T02:07:36.9865809Z 2022-11-23T02:07:36.9865918Z Generating XML reports... 2022-11-23T02:07:36.9866367Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015302.xml 2022-11-23T02:07:36.9866803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9866977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9867361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9867555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9867574Z 2022-11-23T02:07:36.9867680Z Running tests... 2022-11-23T02:07:36.9867936Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9868231Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9868504Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9868719Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28016 2022-11-23T02:07:36.9868934Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28017 2022-11-23T02:07:36.9869308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9869527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9869916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9870106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9870474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9870633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9871003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9871194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9871437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9871683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9872091Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9872487Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9872717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9872930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9873184Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpskbqre9g 2022-11-23T02:07:36.9873454Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpskbqre9g/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9873706Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqtgilb34 2022-11-23T02:07:36.9873973Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqtgilb34/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9874333Z [1669168395.388424] [d8f8c46cdf70:28016:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9874568Z [1669168395.394858] [d8f8c46cdf70:28016:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9874808Z [1669168395.394858] [d8f8c46cdf70:28016:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9875283Z [1669168395.389186] [d8f8c46cdf70:28017:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9875607Z [1669168395.394681] [d8f8c46cdf70:28017:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9875832Z [1669168395.394681] [d8f8c46cdf70:28017:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9876073Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9876309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:36.9876542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9876768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9876865Z ok (6.070s) 2022-11-23T02:07:36.9876886Z 2022-11-23T02:07:36.9877160Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9877273Z Ran 1 test in 6.071s 2022-11-23T02:07:36.9877292Z 2022-11-23T02:07:36.9877369Z OK 2022-11-23T02:07:36.9877400Z 2022-11-23T02:07:36.9877509Z Generating XML reports... 2022-11-23T02:07:36.9877955Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015310.xml 2022-11-23T02:07:36.9878390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9878572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9878954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9879142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9879162Z 2022-11-23T02:07:36.9879266Z Running tests... 2022-11-23T02:07:36.9879525Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9879827Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9880102Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9880849Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78641 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.625s) 2022-11-23T02:07:36.9880870Z 2022-11-23T02:07:36.9881168Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9881278Z Ran 1 test in 1.626s 2022-11-23T02:07:36.9881297Z 2022-11-23T02:07:36.9881408Z OK (skipped=1) 2022-11-23T02:07:36.9881430Z 2022-11-23T02:07:36.9881554Z Generating XML reports... 2022-11-23T02:07:36.9882003Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015319.xml 2022-11-23T02:07:36.9882377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9882551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9882922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9883115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9883134Z 2022-11-23T02:07:36.9883236Z Running tests... 2022-11-23T02:07:36.9883500Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9883810Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9884101Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9884900Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77261 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.637s) 2022-11-23T02:07:36.9884921Z 2022-11-23T02:07:36.9885185Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9885294Z Ran 1 test in 1.637s 2022-11-23T02:07:36.9885314Z 2022-11-23T02:07:36.9885406Z OK (skipped=1) 2022-11-23T02:07:36.9885444Z 2022-11-23T02:07:36.9885550Z Generating XML reports... 2022-11-23T02:07:36.9885997Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015323.xml 2022-11-23T02:07:36.9886366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9886550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9886935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9887123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9887142Z 2022-11-23T02:07:36.9887248Z Running tests... 2022-11-23T02:07:36.9887555Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9887862Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9888146Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9888361Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28202 2022-11-23T02:07:36.9888574Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28203 2022-11-23T02:07:36.9888942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9889123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9889500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9889691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9890051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9890209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9890582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9890769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9891014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9891260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9891663Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9892061Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9892287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9892501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9892751Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb_l8_jfe 2022-11-23T02:07:36.9893015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb_l8_jfe/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9893268Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphkxcslnu 2022-11-23T02:07:36.9893591Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphkxcslnu/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9893795Z 2022-11-23T02:07:36.9894075Z [1669168412.218243] [d8f8c46cdf70:28202:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9894316Z [1669168412.224998] [d8f8c46cdf70:28202:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9894557Z [1669168412.224998] [d8f8c46cdf70:28202:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9894833Z [1669168412.223887] [d8f8c46cdf70:28203:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9895052Z [1669168412.228953] [d8f8c46cdf70:28203:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9895295Z [1669168412.228953] [d8f8c46cdf70:28203:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9895396Z ok (5.554s) 2022-11-23T02:07:36.9895416Z 2022-11-23T02:07:36.9895680Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9895833Z Ran 1 test in 5.555s 2022-11-23T02:07:36.9895855Z 2022-11-23T02:07:36.9895946Z OK 2022-11-23T02:07:36.9895965Z 2022-11-23T02:07:36.9896087Z Generating XML reports... 2022-11-23T02:07:36.9896539Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015327.xml 2022-11-23T02:07:36.9896899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9897076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9897461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9897655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9897675Z 2022-11-23T02:07:36.9897780Z Running tests... 2022-11-23T02:07:36.9898039Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9898351Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9898654Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9898870Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28316 2022-11-23T02:07:36.9899072Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28317 2022-11-23T02:07:36.9899442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9899621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9899999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9900188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9900553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9900732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9901104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9901288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9901520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9901823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9902221Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9902609Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9902837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9903057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9903317Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnq8nwmzk 2022-11-23T02:07:36.9903585Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnq8nwmzk/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9903825Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgasfnw_r 2022-11-23T02:07:36.9904099Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgasfnw_r/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9904377Z [1669168420.267399] [d8f8c46cdf70:28316:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9904657Z [1669168420.274030] [d8f8c46cdf70:28316:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9904910Z [1669168420.274030] [d8f8c46cdf70:28316:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9905181Z [1669168420.271097] [d8f8c46cdf70:28317:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9905410Z [1669168420.276511] [d8f8c46cdf70:28317:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9905650Z [1669168420.276511] [d8f8c46cdf70:28317:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9905749Z ok (5.508s) 2022-11-23T02:07:36.9905769Z 2022-11-23T02:07:36.9906039Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9906135Z Ran 1 test in 5.508s 2022-11-23T02:07:36.9906154Z 2022-11-23T02:07:36.9906245Z OK 2022-11-23T02:07:36.9906269Z 2022-11-23T02:07:36.9906388Z Generating XML reports... 2022-11-23T02:07:36.9906831Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015335.xml 2022-11-23T02:07:36.9907204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9907372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9907752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9907949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9907969Z 2022-11-23T02:07:36.9908073Z Running tests... 2022-11-23T02:07:36.9908325Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9908632Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9908898Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9909114Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28430 2022-11-23T02:07:36.9909327Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28431 2022-11-23T02:07:36.9909698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9909872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9910313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9910488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9910844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9911016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9911391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9911580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9911823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9912063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9912458Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9912860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9913078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9913365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9913632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbwae7_oh 2022-11-23T02:07:36.9913901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbwae7_oh/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9914153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp74zsu17k 2022-11-23T02:07:36.9914417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp74zsu17k/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9914657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9914886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9915367Z [1669168428.380646] [d8f8c46cdf70:28431:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9915602Z [1669168428.386182] [d8f8c46cdf70:28431:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9915846Z [1669168428.386182] [d8f8c46cdf70:28431:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9916120Z [1669168428.378157] [d8f8c46cdf70:28430:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9916350Z [1669168428.384948] [d8f8c46cdf70:28430:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9916591Z [1669168428.384948] [d8f8c46cdf70:28430:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9916688Z ok (6.067s) 2022-11-23T02:07:36.9916710Z 2022-11-23T02:07:36.9916986Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9917097Z Ran 1 test in 6.067s 2022-11-23T02:07:36.9917120Z 2022-11-23T02:07:36.9917214Z OK 2022-11-23T02:07:36.9917233Z 2022-11-23T02:07:36.9917341Z Generating XML reports... 2022-11-23T02:07:36.9917795Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015343.xml 2022-11-23T02:07:36.9918174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9918352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9918737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9919013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9919033Z 2022-11-23T02:07:36.9919137Z Running tests... 2022-11-23T02:07:36.9919400Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9919700Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9919987Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9920203Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28548 2022-11-23T02:07:36.9920421Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28549 2022-11-23T02:07:36.9920789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9920967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9921347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9921531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9921948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9922116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9922492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9922675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9922917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9923158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9923564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9923964Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9924195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9924420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9924660Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphoyi95de 2022-11-23T02:07:36.9924929Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphoyi95de/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9925178Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7xguv8we 2022-11-23T02:07:36.9925437Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7xguv8we/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9925713Z [1669168436.944219] [d8f8c46cdf70:28549:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9925950Z [1669168436.950949] [d8f8c46cdf70:28549:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9926197Z [1669168436.950949] [d8f8c46cdf70:28549:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9926471Z [1669168436.943974] [d8f8c46cdf70:28548:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9926704Z [1669168436.950513] [d8f8c46cdf70:28548:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9926942Z [1669168436.950513] [d8f8c46cdf70:28548:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9927781Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:07:36.9927872Z ok (6.060s) 2022-11-23T02:07:36.9927910Z 2022-11-23T02:07:36.9928166Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9928275Z Ran 1 test in 6.061s 2022-11-23T02:07:36.9928295Z 2022-11-23T02:07:36.9928384Z OK 2022-11-23T02:07:36.9928404Z 2022-11-23T02:07:36.9928529Z Generating XML reports... 2022-11-23T02:07:36.9928982Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015352.xml 2022-11-23T02:07:36.9929360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9929536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9929963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9930148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9930180Z 2022-11-23T02:07:36.9930272Z Running tests... 2022-11-23T02:07:36.9930537Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9930848Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9931126Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9931882Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78235 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.647s) 2022-11-23T02:07:36.9931902Z 2022-11-23T02:07:36.9932168Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9932282Z Ran 1 test in 1.647s 2022-11-23T02:07:36.9932302Z 2022-11-23T02:07:36.9932404Z OK (skipped=1) 2022-11-23T02:07:36.9932423Z 2022-11-23T02:07:36.9932542Z Generating XML reports... 2022-11-23T02:07:36.9932976Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015400.xml 2022-11-23T02:07:36.9933347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9933528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9933908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9934099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9934118Z 2022-11-23T02:07:36.9934230Z Running tests... 2022-11-23T02:07:36.9934500Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9934815Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9935058Z test_ddp_create_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9935280Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28700 2022-11-23T02:07:36.9935493Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28701 2022-11-23T02:07:36.9935863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9936094Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9936480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9936675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9937042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9937216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9937575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9937765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9938008Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9938254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9938657Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9939097Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9939328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9939587Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1hvgasqh 2022-11-23T02:07:36.9939859Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1hvgasqh/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9940086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9940340Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw4wrfgff 2022-11-23T02:07:36.9940614Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw4wrfgff/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9940897Z [1669168449.044579] [d8f8c46cdf70:28701:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9941125Z [1669168449.810442] [d8f8c46cdf70:28701:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9941369Z [1669168449.810442] [d8f8c46cdf70:28701:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9942263Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9942540Z [1669168449.023088] [d8f8c46cdf70:28700:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9942771Z [1669168449.811280] [d8f8c46cdf70:28700:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9943008Z [1669168449.811280] [d8f8c46cdf70:28700:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9943891Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9945055Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 2022-11-23T02:07:36.9945350Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:07:36.9946505Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 
2022-11-23T02:07:36.9946739Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:07:36.9946972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9947247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9948147Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9949031Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9949912Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9950788Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9951668Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9952536Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9953401Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9954392Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9955528Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9956416Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:07:36.9956520Z ok (5.557s) 2022-11-23T02:07:36.9956614Z 2022-11-23T02:07:36.9956894Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9957007Z Ran 1 test in 5.557s 2022-11-23T02:07:36.9957027Z 2022-11-23T02:07:36.9957120Z OK 2022-11-23T02:07:36.9957139Z 2022-11-23T02:07:36.9957248Z Generating XML reports... 2022-11-23T02:07:36.9957697Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015405.xml 2022-11-23T02:07:36.9958072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9958255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9958640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9958833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9958853Z 2022-11-23T02:07:36.9958967Z Running tests... 2022-11-23T02:07:36.9959232Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9959529Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9959777Z test_ddp_device (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9960519Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77324 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.621s) 2022-11-23T02:07:36.9960544Z 2022-11-23T02:07:36.9960807Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9960921Z Ran 1 test in 1.621s 2022-11-23T02:07:36.9960940Z 2022-11-23T02:07:36.9961040Z OK (skipped=1) 2022-11-23T02:07:36.9961060Z 2022-11-23T02:07:36.9961183Z Generating XML reports... 2022-11-23T02:07:36.9961631Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015413.xml 2022-11-23T02:07:36.9962005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9962180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9962546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9962809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9962829Z 2022-11-23T02:07:36.9962933Z Running tests... 2022-11-23T02:07:36.9963199Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9963510Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9963783Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9964001Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28848 2022-11-23T02:07:36.9964216Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28849 2022-11-23T02:07:36.9964586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9964747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9965126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9965317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9965683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9965855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9966272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9966465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9966706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9966934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9967340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9967743Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9967975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9968203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9968456Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4ir3nf9x 2022-11-23T02:07:36.9968730Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4ir3nf9x/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9968977Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdoejepw3 2022-11-23T02:07:36.9969244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdoejepw3/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9970020Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T02:07:36.9970358Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T02:07:36.9971148Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T02:07:36.9971482Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T02:07:36.9971760Z [1669168461.947384] [d8f8c46cdf70:28849:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9972051Z [1669168461.953250] [d8f8c46cdf70:28849:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9972289Z [1669168461.953250] [d8f8c46cdf70:28849:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9972566Z [1669168461.946560] [d8f8c46cdf70:28848:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9972803Z [1669168461.953523] [d8f8c46cdf70:28848:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9973045Z [1669168461.953523] [d8f8c46cdf70:28848:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9973150Z ok (5.957s) 2022-11-23T02:07:36.9973170Z 2022-11-23T02:07:36.9973434Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9973536Z Ran 1 test in 5.957s 2022-11-23T02:07:36.9973556Z 2022-11-23T02:07:36.9973645Z OK 2022-11-23T02:07:36.9973664Z 2022-11-23T02:07:36.9973788Z Generating XML reports... 2022-11-23T02:07:36.9974239Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015417.xml 2022-11-23T02:07:36.9974678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9974860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9975242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9975433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9975454Z 2022-11-23T02:07:36.9975547Z Running tests... 
2022-11-23T02:07:36.9975809Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9976121Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9976389Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9977136Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78685 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.631s) 2022-11-23T02:07:36.9977157Z 2022-11-23T02:07:36.9977418Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9977529Z Ran 1 test in 1.631s 2022-11-23T02:07:36.9977548Z 2022-11-23T02:07:36.9977655Z OK (skipped=1) 2022-11-23T02:07:36.9977674Z 2022-11-23T02:07:36.9977795Z Generating XML reports... 2022-11-23T02:07:36.9978247Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015425.xml 2022-11-23T02:07:36.9978608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9978782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9979171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9979360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9979379Z 2022-11-23T02:07:36.9979486Z Running tests... 
2022-11-23T02:07:36.9979750Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9980058Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9980324Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9981174Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77293 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.656s) 2022-11-23T02:07:36.9981196Z 2022-11-23T02:07:36.9981462Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9981559Z Ran 1 test in 1.656s 2022-11-23T02:07:36.9981578Z 2022-11-23T02:07:36.9981681Z OK (skipped=1) 2022-11-23T02:07:36.9981700Z 2022-11-23T02:07:36.9981822Z Generating XML reports... 2022-11-23T02:07:36.9982271Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015429.xml 2022-11-23T02:07:36.9982643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9982830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9983215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9983408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9983428Z 2022-11-23T02:07:36.9983519Z Running tests... 
2022-11-23T02:07:36.9983830Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9984154Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9984445Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9984666Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29064 2022-11-23T02:07:36.9984883Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29065 2022-11-23T02:07:36.9987071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9987250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9987631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9987811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9988180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9988353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9988727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9988911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9989148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:36.9989389Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:36.9989792Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:36.9990195Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:36.9990411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:36.9990648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:36.9990874Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:36.9991108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:36.9991505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9991964Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:36.9992220Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps24hk2lc 2022-11-23T02:07:36.9992493Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps24hk2lc/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9992743Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3ut80ut2 2022-11-23T02:07:36.9992995Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3ut80ut2/_remote_module_non_scriptable.py 2022-11-23T02:07:36.9993272Z [1669168478.896083] [d8f8c46cdf70:29064:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:36.9993502Z [1669168478.903322] [d8f8c46cdf70:29064:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9993741Z [1669168478.903322] [d8f8c46cdf70:29064:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9994014Z [1669168478.900944] [d8f8c46cdf70:29065:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:36.9994286Z [1669168478.907967] [d8f8c46cdf70:29065:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:36.9994534Z [1669168478.907967] [d8f8c46cdf70:29065:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:36.9994765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9994990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9995441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9995670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:36.9995768Z ok (6.268s) 2022-11-23T02:07:36.9995789Z 2022-11-23T02:07:36.9996064Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9996175Z Ran 1 test in 6.268s 2022-11-23T02:07:36.9996198Z 2022-11-23T02:07:36.9996290Z OK 2022-11-23T02:07:36.9996309Z 2022-11-23T02:07:36.9996426Z Generating XML reports... 
2022-11-23T02:07:36.9996878Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015434.xml 2022-11-23T02:07:36.9997251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:36.9997415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:36.9997796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:36.9997994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:36.9998013Z 2022-11-23T02:07:36.9998117Z Running tests... 2022-11-23T02:07:36.9998380Z ---------------------------------------------------------------------- 2022-11-23T02:07:36.9998690Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:36.9998963Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:36.9999177Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29182 2022-11-23T02:07:36.9999379Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29183 2022-11-23T02:07:36.9999750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0000008Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0000388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0000572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0000939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0001114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0001487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0001676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0001905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0002143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0002543Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0002938Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0003226Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0003510Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:07:37.0003738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0004011Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:07:37.0004263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn72lml_8 2022-11-23T02:07:37.0004519Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn72lml_8/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0004770Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc_3fkxul 2022-11-23T02:07:37.0005034Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc_3fkxul/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0005312Z [1669168487.701355] [d8f8c46cdf70:29183:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0005549Z [1669168487.707600] [d8f8c46cdf70:29183:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0005789Z [1669168487.707600] [d8f8c46cdf70:29183:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0006059Z [1669168487.693784] [d8f8c46cdf70:29182:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0006296Z [1669168487.700977] [d8f8c46cdf70:29182:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0006532Z [1669168487.700977] [d8f8c46cdf70:29182:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0006767Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0006986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0007217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0007443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0007718Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T02:07:37.0007990Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T02:07:37.0008316Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:07:37.0008586Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:07:37.0008821Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0009049Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0009269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0009495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0009772Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 
2022-11-23T02:07:37.0010045Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T02:07:37.0010321Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T02:07:37.0010585Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T02:07:37.0010861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0011098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0011313Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0011542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0011639Z ok (6.663s) 2022-11-23T02:07:37.0011660Z 2022-11-23T02:07:37.0011929Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0012037Z Ran 1 test in 6.663s 2022-11-23T02:07:37.0012057Z 2022-11-23T02:07:37.0012143Z OK 2022-11-23T02:07:37.0012162Z 2022-11-23T02:07:37.0012282Z Generating XML reports... 2022-11-23T02:07:37.0012729Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015442.xml 2022-11-23T02:07:37.0013106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0013271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0013646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0013831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0013851Z 2022-11-23T02:07:37.0013955Z Running tests... 
2022-11-23T02:07:37.0014217Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0014529Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0014794Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0015548Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77378 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.607s) 2022-11-23T02:07:37.0015569Z 2022-11-23T02:07:37.0015826Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0015923Z Ran 1 test in 1.607s 2022-11-23T02:07:37.0015960Z 2022-11-23T02:07:37.0016051Z OK (skipped=1) 2022-11-23T02:07:37.0016070Z 2022-11-23T02:07:37.0016196Z Generating XML reports... 2022-11-23T02:07:37.0016639Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015452.xml 2022-11-23T02:07:37.0017066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0017236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0017614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0017805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0017825Z 2022-11-23T02:07:37.0017926Z Running tests... 
2022-11-23T02:07:37.0018174Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0018485Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0018755Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0018975Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29334 2022-11-23T02:07:37.0019192Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29335 2022-11-23T02:07:37.0019563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0019738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0020166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0020363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0020715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0020881Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0021247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0021438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0021683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0021926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0022332Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0022728Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0022945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0023487Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:07:37.0023719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0024252Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:07:37.0024505Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2so_7cg8 2022-11-23T02:07:37.0024772Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2so_7cg8/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0025025Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf7uv1z2c 2022-11-23T02:07:37.0025346Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf7uv1z2c/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0025575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0025803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:37.0026077Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T02:07:37.0026338Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T02:07:37.0026633Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T02:07:37.0026956Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T02:07:37.0027281Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T02:07:37.0027573Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T02:07:37.0027931Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T02:07:37.0028251Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T02:07:37.0028531Z [1669168500.944101] [d8f8c46cdf70:29334:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0028770Z [1669168500.951983] [d8f8c46cdf70:29334:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0029012Z [1669168500.951983] [d8f8c46cdf70:29334:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0029286Z [1669168500.948310] [d8f8c46cdf70:29335:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0029506Z [1669168500.954254] [d8f8c46cdf70:29335:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0029739Z [1669168500.954254] [d8f8c46cdf70:29335:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0029969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0030200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0030297Z ok (6.014s) 2022-11-23T02:07:37.0030317Z 2022-11-23T02:07:37.0030580Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0030689Z Ran 1 test in 6.014s 2022-11-23T02:07:37.0030709Z 2022-11-23T02:07:37.0030798Z OK 2022-11-23T02:07:37.0030818Z 2022-11-23T02:07:37.0030934Z Generating XML reports... 2022-11-23T02:07:37.0031372Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015456.xml 2022-11-23T02:07:37.0031751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0031926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0032307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0032497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0032517Z 2022-11-23T02:07:37.0032622Z Running tests... 2022-11-23T02:07:37.0032877Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0033246Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0033627Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... 
skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0033657Z 2022-11-23T02:07:37.0033909Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0034021Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0034040Z 2022-11-23T02:07:37.0034142Z OK (skipped=1) 2022-11-23T02:07:37.0034161Z 2022-11-23T02:07:37.0034277Z Generating XML reports... 2022-11-23T02:07:37.0034720Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015504.xml 2022-11-23T02:07:37.0035285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0035473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0035858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0036037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0036072Z 2022-11-23T02:07:37.0036164Z Running tests... 2022-11-23T02:07:37.0036508Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0036835Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0037225Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0037246Z 2022-11-23T02:07:37.0037509Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0037622Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0037645Z 2022-11-23T02:07:37.0037744Z OK (skipped=1) 2022-11-23T02:07:37.0037764Z 2022-11-23T02:07:37.0037879Z Generating XML reports... 
2022-11-23T02:07:37.0038311Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015507.xml 2022-11-23T02:07:37.0038689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0038865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0039244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0039432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0039451Z 2022-11-23T02:07:37.0039554Z Running tests... 2022-11-23T02:07:37.0039810Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0040115Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0040567Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0040587Z 2022-11-23T02:07:37.0040850Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0040947Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0040966Z 2022-11-23T02:07:37.0041071Z OK (skipped=1) 2022-11-23T02:07:37.0041090Z 2022-11-23T02:07:37.0041205Z Generating XML reports... 
2022-11-23T02:07:37.0041646Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015509.xml 2022-11-23T02:07:37.0042021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0042277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0042655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0042839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0042859Z 2022-11-23T02:07:37.0042960Z Running tests... 2022-11-23T02:07:37.0043207Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0043512Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0043951Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0043971Z 2022-11-23T02:07:37.0044231Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0044343Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0044362Z 2022-11-23T02:07:37.0044469Z OK (skipped=1) 2022-11-23T02:07:37.0044489Z 2022-11-23T02:07:37.0044605Z Generating XML reports... 
2022-11-23T02:07:37.0045044Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015512.xml 2022-11-23T02:07:37.0045462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0045633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0046011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0046207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0046227Z 2022-11-23T02:07:37.0046335Z Running tests... 2022-11-23T02:07:37.0046601Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0046920Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0047373Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0047397Z 2022-11-23T02:07:37.0047664Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0047778Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0047797Z 2022-11-23T02:07:37.0047889Z OK (skipped=1) 2022-11-23T02:07:37.0047907Z 2022-11-23T02:07:37.0048033Z Generating XML reports... 
2022-11-23T02:07:37.0048481Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015514.xml 2022-11-23T02:07:37.0048856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0049037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0049426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0049621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0049641Z 2022-11-23T02:07:37.0049755Z Running tests... 2022-11-23T02:07:37.0050021Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0050319Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0050767Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0050787Z 2022-11-23T02:07:37.0051058Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0051232Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0051252Z 2022-11-23T02:07:37.0051360Z OK (skipped=1) 2022-11-23T02:07:37.0051379Z 2022-11-23T02:07:37.0051504Z Generating XML reports... 
2022-11-23T02:07:37.0051960Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015516.xml 2022-11-23T02:07:37.0052342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0052524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0052892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0053087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0053107Z 2022-11-23T02:07:37.0053217Z Running tests... 2022-11-23T02:07:37.0053484Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0053802Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0054299Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0054322Z 2022-11-23T02:07:37.0054592Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0054707Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0054727Z 2022-11-23T02:07:37.0054837Z OK (skipped=1) 2022-11-23T02:07:37.0054856Z 2022-11-23T02:07:37.0054962Z Generating XML reports... 
2022-11-23T02:07:37.0055416Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015519.xml 2022-11-23T02:07:37.0055792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0055976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0056362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0056558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0056581Z 2022-11-23T02:07:37.0056694Z Running tests... 2022-11-23T02:07:37.0056962Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0057276Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0057706Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0057744Z 2022-11-23T02:07:37.0057994Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0058107Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0058126Z 2022-11-23T02:07:37.0058236Z OK (skipped=1) 2022-11-23T02:07:37.0058256Z 2022-11-23T02:07:37.0058381Z Generating XML reports... 
2022-11-23T02:07:37.0058832Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015521.xml 2022-11-23T02:07:37.0059206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0059384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0059766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0059943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0059980Z 2022-11-23T02:07:37.0060072Z Running tests... 2022-11-23T02:07:37.0060401Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0060713Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0061164Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0061185Z 2022-11-23T02:07:37.0061449Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0061564Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0061583Z 2022-11-23T02:07:37.0061693Z OK (skipped=1) 2022-11-23T02:07:37.0061713Z 2022-11-23T02:07:37.0061839Z Generating XML reports... 
2022-11-23T02:07:37.0062289Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015523.xml 2022-11-23T02:07:37.0062652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0062835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0063219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0063464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0063486Z 2022-11-23T02:07:37.0063600Z Running tests... 2022-11-23T02:07:37.0063871Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0064186Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0064631Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0064656Z 2022-11-23T02:07:37.0064919Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0065016Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0065035Z 2022-11-23T02:07:37.0065146Z OK (skipped=1) 2022-11-23T02:07:37.0065165Z 2022-11-23T02:07:37.0065339Z Generating XML reports... 
2022-11-23T02:07:37.0065840Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015526.xml 2022-11-23T02:07:37.0066363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0066638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0067011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0067255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0067275Z 2022-11-23T02:07:37.0067430Z Running tests... 2022-11-23T02:07:37.0067732Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0068085Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0068519Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0068541Z 2022-11-23T02:07:37.0068845Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0069042Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0069062Z 2022-11-23T02:07:37.0069219Z OK (skipped=1) 2022-11-23T02:07:37.0069239Z 2022-11-23T02:07:37.0069402Z Generating XML reports... 
2022-11-23T02:07:37.0069843Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015528.xml 2022-11-23T02:07:37.0070253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0070528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0070953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0071188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0071209Z 2022-11-23T02:07:37.0071367Z Running tests... 2022-11-23T02:07:37.0071756Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0072163Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0072540Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:07:37.0072613Z 2022-11-23T02:07:37.0072871Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0073023Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0073044Z 2022-11-23T02:07:37.0073189Z OK (skipped=1) 2022-11-23T02:07:37.0073208Z 2022-11-23T02:07:37.0073368Z Generating XML reports... 
2022-11-23T02:07:37.0073919Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015531.xml 2022-11-23T02:07:37.0074343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0074600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0075222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0075415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0075493Z 2022-11-23T02:07:37.0075587Z Running tests... 2022-11-23T02:07:37.0075902Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0076255Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0076573Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0077369Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77325 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.654s) 2022-11-23T02:07:37.0077391Z 2022-11-23T02:07:37.0077752Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0077947Z Ran 1 test in 1.654s 2022-11-23T02:07:37.0077968Z 2022-11-23T02:07:37.0078114Z OK (skipped=1) 2022-11-23T02:07:37.0078134Z 2022-11-23T02:07:37.0078294Z Generating XML reports... 
2022-11-23T02:07:37.0078735Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015533.xml 2022-11-23T02:07:37.0079147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0079375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0079802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0080036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0080057Z 2022-11-23T02:07:37.0080204Z Running tests... 2022-11-23T02:07:37.0080539Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0080890Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0081169Z test_ddp_inference (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0081539Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29882 2022-11-23T02:07:37.0081800Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29883 2022-11-23T02:07:37.0082225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0082442Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0082921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0083155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0083606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0083833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0084199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0084436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0084720Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0085069Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0085524Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0085964Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0086234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0086553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0086857Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp49f454a3 2022-11-23T02:07:37.0087114Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp49f454a3/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0087407Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2j81zkin 2022-11-23T02:07:37.0087718Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2j81zkin/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0088036Z [1669168542.358719] [d8f8c46cdf70:29882:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0088359Z [1669168542.365061] [d8f8c46cdf70:29882:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0088643Z [1669168542.365061] [d8f8c46cdf70:29882:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0088976Z [1669168542.363558] [d8f8c46cdf70:29883:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0089279Z [1669168542.369689] [d8f8c46cdf70:29883:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0089566Z [1669168542.369689] [d8f8c46cdf70:29883:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0089656Z ok (6.169s) 2022-11-23T02:07:37.0089728Z 2022-11-23T02:07:37.0089990Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0090143Z Ran 1 test in 6.169s 2022-11-23T02:07:37.0090163Z 2022-11-23T02:07:37.0090295Z OK 2022-11-23T02:07:37.0090315Z 2022-11-23T02:07:37.0090478Z Generating XML reports... 2022-11-23T02:07:37.0090978Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015537.xml 2022-11-23T02:07:37.0091456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0091714Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0092143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0092328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0092347Z 2022-11-23T02:07:37.0092497Z Running tests... 2022-11-23T02:07:37.0092804Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0093218Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0093532Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0093792Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29996 2022-11-23T02:07:37.0094053Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29997 2022-11-23T02:07:37.0094502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0094667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0095141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0095380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0095797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0096011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0096423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0096647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0096946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0097267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0097667Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0098106Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0098387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0098708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0099001Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2r5luf6g 2022-11-23T02:07:37.0099308Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2r5luf6g/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0099603Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwoc_wy48 2022-11-23T02:07:37.0099906Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwoc_wy48/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0100220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0100448Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0100774Z [1669168551.502000] [d8f8c46cdf70:29997:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0101049Z [1669168551.509114] [d8f8c46cdf70:29997:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0101330Z [1669168551.509114] [d8f8c46cdf70:29997:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0101701Z [1669168551.497877] [d8f8c46cdf70:29996:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0101974Z [1669168551.504943] [d8f8c46cdf70:29996:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0102254Z [1669168551.504943] [d8f8c46cdf70:29996:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0102717Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:07:37.0102957Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:07:37.0103400Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:07:37.0103550Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:07:37.0103746Z ok (5.985s) 2022-11-23T02:07:37.0103767Z 2022-11-23T02:07:37.0104074Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0104223Z Ran 1 test in 5.985s 2022-11-23T02:07:37.0104244Z 2022-11-23T02:07:37.0104371Z OK 2022-11-23T02:07:37.0104391Z 2022-11-23T02:07:37.0104560Z Generating XML reports... 2022-11-23T02:07:37.0105100Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015546.xml 2022-11-23T02:07:37.0105561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0105729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0106151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0106383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0106408Z 2022-11-23T02:07:37.0106556Z Running tests... 
2022-11-23T02:07:37.0106862Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0107224Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0107528Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0107788Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30114 2022-11-23T02:07:37.0107992Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30115 2022-11-23T02:07:37.0108444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0108709Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0109131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0109364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0109780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0109992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0110410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0110642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0110876Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0111195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0111640Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0112148Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0112420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0112690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0112997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph6ys8xtz 2022-11-23T02:07:37.0113292Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkwu2zlq1 2022-11-23T02:07:37.0113600Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph6ys8xtz/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0113859Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkwu2zlq1/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0114222Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0114514Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0114834Z [1669168558.769567] [d8f8c46cdf70:30115:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0115340Z [1669168559.534441] [d8f8c46cdf70:30115:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0115710Z [1669168559.534441] [d8f8c46cdf70:30115:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0116041Z [1669168558.748107] [d8f8c46cdf70:30114:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0116313Z [1669168559.552303] [d8f8c46cdf70:30114:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0116594Z [1669168559.552303] [d8f8c46cdf70:30114:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0116688Z ok (5.461s) 2022-11-23T02:07:37.0116816Z 2022-11-23T02:07:37.0117086Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0117237Z Ran 1 test in 5.461s 2022-11-23T02:07:37.0117257Z 2022-11-23T02:07:37.0117385Z OK 2022-11-23T02:07:37.0117405Z 2022-11-23T02:07:37.0117571Z Generating XML reports... 2022-11-23T02:07:37.0118064Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015554.xml 2022-11-23T02:07:37.0118479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0118695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0119175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0119362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0119383Z 2022-11-23T02:07:37.0119569Z Running tests... 2022-11-23T02:07:37.0119871Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0120222Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0120529Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0120789Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30258 2022-11-23T02:07:37.0121041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30259 2022-11-23T02:07:37.0121443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0131226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0131690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0132026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0132413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0132587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0132968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0133157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0133392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0133632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0134034Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0134441Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0134666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0134947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0135215Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkc9d8m8o 2022-11-23T02:07:37.0135485Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkc9d8m8o/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0135735Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw2grw_in 2022-11-23T02:07:37.0135985Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw2grw_in/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0136260Z [1669168567.544671] [d8f8c46cdf70:30258:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0136501Z [1669168567.551149] [d8f8c46cdf70:30258:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0136748Z [1669168567.551149] [d8f8c46cdf70:30258:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0136982Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0137251Z [1669168567.547723] [d8f8c46cdf70:30259:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0137482Z [1669168567.554307] [d8f8c46cdf70:30259:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0137719Z [1669168567.554307] [d8f8c46cdf70:30259:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0137959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:37.0138060Z ok (6.029s) 2022-11-23T02:07:37.0138082Z 2022-11-23T02:07:37.0138348Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0138464Z Ran 1 test in 6.029s 2022-11-23T02:07:37.0138484Z 2022-11-23T02:07:37.0138577Z OK 2022-11-23T02:07:37.0138597Z 2022-11-23T02:07:37.0138719Z Generating XML reports... 2022-11-23T02:07:37.0139168Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015602.xml 2022-11-23T02:07:37.0139546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0139721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0140106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0140345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0140383Z 2022-11-23T02:07:37.0140476Z Running tests... 2022-11-23T02:07:37.0140742Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0141055Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0141336Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0141549Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30376 2022-11-23T02:07:37.0141761Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30377 2022-11-23T02:07:37.0142136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0142311Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0142685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0142874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0143239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0143459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0143845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0144033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0144275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0144516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0144910Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0145313Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0145540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0145762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0146001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0146242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0146636Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0147025Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0147266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:07:37.0147485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:07:37.0147883Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:37.0148272Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:07:37.0148524Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiggypag4 2022-11-23T02:07:37.0148795Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiggypag4/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0149043Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprj7eq2ex 2022-11-23T02:07:37.0149369Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprj7eq2ex/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0149645Z [1669168576.188966] [d8f8c46cdf70:30376:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0149885Z [1669168576.195663] [d8f8c46cdf70:30376:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0150130Z [1669168576.195663] [d8f8c46cdf70:30376:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0150390Z [1669168576.196304] [d8f8c46cdf70:30377:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0150620Z [1669168576.203549] [d8f8c46cdf70:30377:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0150858Z [1669168576.203549] [d8f8c46cdf70:30377:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0150964Z ok (5.556s) 2022-11-23T02:07:37.0150985Z 2022-11-23T02:07:37.0151253Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0151363Z Ran 1 test in 5.556s 2022-11-23T02:07:37.0151382Z 2022-11-23T02:07:37.0151472Z OK 2022-11-23T02:07:37.0151541Z 2022-11-23T02:07:37.0151670Z Generating XML reports... 
2022-11-23T02:07:37.0152123Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015611.xml 2022-11-23T02:07:37.0152485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0152657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0153035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0153232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0153252Z 2022-11-23T02:07:37.0153359Z Running tests... 2022-11-23T02:07:37.0153621Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0153932Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0154218Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0154424Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30496 2022-11-23T02:07:37.0154642Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30497 2022-11-23T02:07:37.0155008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0155419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0155815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0156001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0156370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0156542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0156915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0157087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0157327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0157569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0157969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0158464Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0158691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0158919Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0159160Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0159400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0159788Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0160177Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0160422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:07:37.0160658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:07:37.0161115Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:37.0161518Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:07:37.0161777Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxsvkchmj 2022-11-23T02:07:37.0162048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxsvkchmj/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0162302Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpryq50qg5 2022-11-23T02:07:37.0162553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpryq50qg5/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0162831Z [1669168584.184603] [d8f8c46cdf70:30496:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0163069Z [1669168584.190397] [d8f8c46cdf70:30496:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0163317Z [1669168584.190397] [d8f8c46cdf70:30496:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0163640Z [1669168594.557338] [d8f8c46cdf70:30496:1] ucc_schedule.h:189 UCC WARN timeout 10 sec. has expired on req 0x55e0abe36e00, seq_num 3, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T02:07:37.0163919Z [1669168594.593444] [d8f8c46cdf70:30496:0] mpool.c:55 UCX WARN object 0x55e0abf48240 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T02:07:37.0164196Z [1669168584.187408] [d8f8c46cdf70:30497:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0164423Z [1669168584.192269] [d8f8c46cdf70:30497:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0164661Z [1669168584.192269] [d8f8c46cdf70:30497:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0165058Z [1669168594.603450] [d8f8c46cdf70:30497:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x55cc5aa26800 was not matched 2022-11-23T02:07:37.0165161Z ok (15.537s) 2022-11-23T02:07:37.0165183Z 2022-11-23T02:07:37.0165435Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0165548Z Ran 1 test in 15.537s 2022-11-23T02:07:37.0165568Z 2022-11-23T02:07:37.0165656Z OK 2022-11-23T02:07:37.0165728Z 2022-11-23T02:07:37.0165853Z Generating XML reports... 2022-11-23T02:07:37.0166305Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015619.xml 2022-11-23T02:07:37.0166679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0166858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0167245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0167422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0167454Z 2022-11-23T02:07:37.0167547Z Running tests... 2022-11-23T02:07:37.0167808Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0168112Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0168420Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0168640Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30616 2022-11-23T02:07:37.0168855Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30617 2022-11-23T02:07:37.0169278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0169465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0169832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0170017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0170382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0170561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0170931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0171118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0171357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0171600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0172001Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0172385Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0172613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0172842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0173099Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm5fa6uyv 2022-11-23T02:07:37.0173369Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm5fa6uyv/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0173624Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm8dfo19l 2022-11-23T02:07:37.0173891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm8dfo19l/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0174168Z [1669168602.244866] [d8f8c46cdf70:30617:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0174400Z [1669168602.250350] [d8f8c46cdf70:30617:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0174628Z [1669168602.250350] [d8f8c46cdf70:30617:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0174964Z [1669168602.238813] [d8f8c46cdf70:30616:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0175197Z [1669168602.245076] [d8f8c46cdf70:30616:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0175439Z [1669168602.245076] [d8f8c46cdf70:30616:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0175541Z ok (6.133s) 2022-11-23T02:07:37.0175561Z 2022-11-23T02:07:37.0175830Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0175941Z Ran 1 test in 6.133s 2022-11-23T02:07:37.0175960Z 2022-11-23T02:07:37.0176047Z OK 2022-11-23T02:07:37.0176067Z 2022-11-23T02:07:37.0176187Z Generating XML reports... 2022-11-23T02:07:37.0176625Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015637.xml 2022-11-23T02:07:37.0177002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0177175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0177608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0177806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0177826Z 2022-11-23T02:07:37.0177930Z Running tests... 2022-11-23T02:07:37.0178195Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0178503Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0178789Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0178999Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30734 2022-11-23T02:07:37.0179211Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30735 2022-11-23T02:07:37.0179579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0179755Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0180126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0180314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0180670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0180840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0181238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0181435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0181679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0181921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0182326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0182720Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0182949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0183174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0183430Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph284fvsd 2022-11-23T02:07:37.0183755Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph284fvsd/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0184008Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbyihcj7i 2022-11-23T02:07:37.0184280Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbyihcj7i/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0184559Z [1669168611.005609] [d8f8c46cdf70:30734:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0184796Z [1669168611.011598] [d8f8c46cdf70:30734:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0185039Z [1669168611.011598] [d8f8c46cdf70:30734:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0185313Z [1669168611.002524] [d8f8c46cdf70:30735:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0185548Z [1669168611.015225] [d8f8c46cdf70:30735:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0185847Z [1669168611.015225] [d8f8c46cdf70:30735:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0185955Z ok (6.148s) 2022-11-23T02:07:37.0185976Z 2022-11-23T02:07:37.0186235Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0186342Z Ran 1 test in 6.149s 2022-11-23T02:07:37.0186362Z 2022-11-23T02:07:37.0186450Z OK 2022-11-23T02:07:37.0186469Z 2022-11-23T02:07:37.0186591Z Generating XML reports... 2022-11-23T02:07:37.0187039Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015646.xml 2022-11-23T02:07:37.0187413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0187591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0187973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0188155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0188185Z 2022-11-23T02:07:37.0188279Z Running tests... 2022-11-23T02:07:37.0188544Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0188848Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0189102Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0189315Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30852 2022-11-23T02:07:37.0189526Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30853 2022-11-23T02:07:37.0189905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0190081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0190451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0190644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0191014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0191179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0191557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0191740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0192047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0192288Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0192679Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0193080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0193304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0193524Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0193777Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsdqlu1hh 2022-11-23T02:07:37.0194046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsdqlu1hh/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0194303Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo24ahc7e 2022-11-23T02:07:37.0194565Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo24ahc7e/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0194888Z [1669168619.638428] [d8f8c46cdf70:30852:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0195344Z [1669168619.644429] [d8f8c46cdf70:30852:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0195593Z [1669168619.644429] [d8f8c46cdf70:30852:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0195868Z [1669168619.643570] [d8f8c46cdf70:30853:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0196162Z [1669168619.649969] [d8f8c46cdf70:30853:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0196398Z [1669168619.649969] [d8f8c46cdf70:30853:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0196497Z ok (6.052s) 2022-11-23T02:07:37.0196517Z 2022-11-23T02:07:37.0196797Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0196906Z Ran 1 test in 6.052s 2022-11-23T02:07:37.0196925Z 2022-11-23T02:07:37.0197015Z OK 2022-11-23T02:07:37.0197034Z 2022-11-23T02:07:37.0197152Z Generating XML reports... 2022-11-23T02:07:37.0197588Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015654.xml 2022-11-23T02:07:37.0197964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0198137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0198522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0198710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0198730Z 2022-11-23T02:07:37.0198836Z Running tests... 2022-11-23T02:07:37.0199099Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0199403Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0199652Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0199864Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30966 2022-11-23T02:07:37.0200077Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30967 2022-11-23T02:07:37.0200447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0200758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0201147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0201336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0201702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0201870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0202231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0202418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0202659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0202902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0203299Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0203698Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0203984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0204215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0204455Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp30jw7zxk 2022-11-23T02:07:37.0204719Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp30jw7zxk/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0204969Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3a9emfyd 2022-11-23T02:07:37.0205239Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3a9emfyd/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0205512Z [1669168628.177351] [d8f8c46cdf70:30967:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0205744Z [1669168628.184023] [d8f8c46cdf70:30967:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0205987Z [1669168628.184023] [d8f8c46cdf70:30967:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0206765Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:07:37.0207044Z [1669168628.170760] [d8f8c46cdf70:30966:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0207281Z [1669168628.176135] [d8f8c46cdf70:30966:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0207516Z [1669168628.176135] [d8f8c46cdf70:30966:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0208290Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:07:37.0208589Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0208825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0208912Z ok (5.927s) 2022-11-23T02:07:37.0208932Z 2022-11-23T02:07:37.0209207Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0209316Z Ran 1 test in 5.927s 2022-11-23T02:07:37.0209336Z 2022-11-23T02:07:37.0209425Z OK 2022-11-23T02:07:37.0209444Z 2022-11-23T02:07:37.0209566Z Generating XML reports... 
2022-11-23T02:07:37.0210014Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015703.xml 2022-11-23T02:07:37.0210386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0210559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0210928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0211168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0211190Z 2022-11-23T02:07:37.0211300Z Running tests... 2022-11-23T02:07:37.0211564Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0211868Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0212148Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0212892Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78338 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.649s) 2022-11-23T02:07:37.0212917Z 2022-11-23T02:07:37.0213180Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0213292Z Ran 1 test in 1.649s 2022-11-23T02:07:37.0213312Z 2022-11-23T02:07:37.0213419Z OK (skipped=1) 2022-11-23T02:07:37.0213438Z 2022-11-23T02:07:37.0213546Z Generating XML reports... 
2022-11-23T02:07:37.0213989Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015712.xml 2022-11-23T02:07:37.0214355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0214531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0214906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0215101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0215121Z 2022-11-23T02:07:37.0215227Z Running tests... 2022-11-23T02:07:37.0215488Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0215787Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0216068Z test_ddp_profiling_autograd_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0216806Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77342 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.651s) 2022-11-23T02:07:37.0216827Z 2022-11-23T02:07:37.0217093Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0217271Z Ran 1 test in 1.651s 2022-11-23T02:07:37.0217290Z 2022-11-23T02:07:37.0217394Z OK (skipped=1) 2022-11-23T02:07:37.0217413Z 2022-11-23T02:07:37.0217533Z Generating XML reports... 
2022-11-23T02:07:37.0217982Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015716.xml 2022-11-23T02:07:37.0218357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0218533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0218900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0219091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0219111Z 2022-11-23T02:07:37.0219216Z Running tests... 2022-11-23T02:07:37.0219479Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0219790Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0220067Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0220286Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31152 2022-11-23T02:07:37.0220564Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31153 2022-11-23T02:07:37.0220950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0221110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0221482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0221669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0222037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0222211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0222584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0222771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0223012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0223240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0223642Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0224036Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0224267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0224494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0224748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpno8rqld0 2022-11-23T02:07:37.0225017Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpno8rqld0/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0225273Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr4yme2qu 2022-11-23T02:07:37.0225535Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr4yme2qu/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0225801Z [1669168645.036034] [d8f8c46cdf70:31153:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0226037Z [1669168645.041528] [d8f8c46cdf70:31153:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0226342Z [1669168645.041528] [d8f8c46cdf70:31153:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0226692Z STAGE:2022-11-23 01:57:25 31153:31153 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0226966Z [1669168645.031910] [d8f8c46cdf70:31152:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0227197Z [1669168645.037191] [d8f8c46cdf70:31152:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0227434Z [1669168645.037191] [d8f8c46cdf70:31152:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0227771Z STAGE:2022-11-23 01:57:25 31152:31152 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0228013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0228243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0228572Z STAGE:2022-11-23 01:57:26 31152:31152 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0228946Z STAGE:2022-11-23 01:57:26 31153:31153 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0229303Z STAGE:2022-11-23 01:57:26 31152:31152 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0229648Z STAGE:2022-11-23 01:57:26 31153:31153 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0230424Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:07:37.0231199Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. 
This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:07:37.0231535Z STAGE:2022-11-23 01:57:26 31152:31152 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0231865Z STAGE:2022-11-23 01:57:26 31153:31153 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0232205Z STAGE:2022-11-23 01:57:26 31152:31152 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0232532Z STAGE:2022-11-23 01:57:26 31153:31153 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0232881Z STAGE:2022-11-23 01:57:26 31152:31152 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0233228Z STAGE:2022-11-23 01:57:26 31153:31153 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0233315Z ok (6.527s) 2022-11-23T02:07:37.0233350Z 2022-11-23T02:07:37.0233603Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0233711Z Ran 1 test in 6.527s 2022-11-23T02:07:37.0233730Z 2022-11-23T02:07:37.0233818Z OK 2022-11-23T02:07:37.0233837Z 2022-11-23T02:07:37.0233960Z Generating XML reports... 
2022-11-23T02:07:37.0234410Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015720.xml 2022-11-23T02:07:37.0234845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0235228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0235634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0235815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0235844Z 2022-11-23T02:07:37.0235937Z Running tests... 2022-11-23T02:07:37.0236199Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0236509Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0236773Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0236999Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31274 2022-11-23T02:07:37.0237212Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31275 2022-11-23T02:07:37.0237582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0237833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0238214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0238403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0238766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0238933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0239300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0239491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0239731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0239978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0240369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0240771Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0240999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0241217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0241475Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnq8yu6fd 2022-11-23T02:07:37.0241745Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnq8yu6fd/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0241990Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp45ze7ndu 2022-11-23T02:07:37.0242261Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp45ze7ndu/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0242541Z [1669168654.208008] [d8f8c46cdf70:31275:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0242764Z [1669168654.215154] [d8f8c46cdf70:31275:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0243006Z [1669168654.215154] [d8f8c46cdf70:31275:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0243277Z [1669168654.207986] [d8f8c46cdf70:31274:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0243588Z [1669168654.213494] [d8f8c46cdf70:31274:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0243827Z [1669168654.213494] [d8f8c46cdf70:31274:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0243928Z ok (5.558s) 2022-11-23T02:07:37.0243948Z 2022-11-23T02:07:37.0244221Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0244330Z Ran 1 test in 5.558s 2022-11-23T02:07:37.0244349Z 2022-11-23T02:07:37.0244439Z OK 2022-11-23T02:07:37.0244458Z 2022-11-23T02:07:37.0244576Z Generating XML reports... 2022-11-23T02:07:37.0245011Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015729.xml 2022-11-23T02:07:37.0245392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0245567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0245945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0246193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0246214Z 2022-11-23T02:07:37.0246323Z Running tests... 2022-11-23T02:07:37.0246587Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0246898Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0247160Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0247906Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78595 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.644s) 2022-11-23T02:07:37.0247943Z 2022-11-23T02:07:37.0248190Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0248302Z Ran 1 test in 1.645s 2022-11-23T02:07:37.0248322Z 2022-11-23T02:07:37.0248431Z OK (skipped=1) 2022-11-23T02:07:37.0248450Z 2022-11-23T02:07:37.0248570Z Generating XML reports... 2022-11-23T02:07:37.0249014Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015737.xml 2022-11-23T02:07:37.0249386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0249563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0249933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0250117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0250136Z 2022-11-23T02:07:37.0250241Z Running tests... 2022-11-23T02:07:37.0250499Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0250810Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0251087Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0251300Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31422 2022-11-23T02:07:37.0251518Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31423 2022-11-23T02:07:37.0251891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0252053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0252493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0252680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0253050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0253223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0253596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0253782Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0254020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0254261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0254652Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0255049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0255321Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0255560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0255816Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpngrh9664 2022-11-23T02:07:37.0256080Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpngrh9664/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0256331Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpknxmjm0z 2022-11-23T02:07:37.0256601Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpknxmjm0z/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0257525Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:07:37.0257635Z warnings.warn( 2022-11-23T02:07:37.0258528Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:07:37.0258640Z warnings.warn( 2022-11-23T02:07:37.0258876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0259106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:07:37.0259386Z [1669168666.577643] [d8f8c46cdf70:31423:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0259624Z [1669168666.583583] [d8f8c46cdf70:31423:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0259865Z [1669168666.583583] [d8f8c46cdf70:31423:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0260143Z [1669168666.576632] [d8f8c46cdf70:31422:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0260371Z [1669168666.583559] [d8f8c46cdf70:31422:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0260664Z [1669168666.583559] [d8f8c46cdf70:31422:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0260751Z ok (6.044s) 2022-11-23T02:07:37.0260782Z 2022-11-23T02:07:37.0261039Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0261149Z Ran 1 test in 6.044s 2022-11-23T02:07:37.0261169Z 2022-11-23T02:07:37.0261261Z OK 2022-11-23T02:07:37.0261280Z 2022-11-23T02:07:37.0261404Z Generating XML reports... 
2022-11-23T02:07:37.0261852Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015741.xml 2022-11-23T02:07:37.0262226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0262403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0262784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0262966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0262985Z 2022-11-23T02:07:37.0263090Z Running tests... 2022-11-23T02:07:37.0263350Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0263707Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0263994Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0264743Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77625 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.609s) 2022-11-23T02:07:37.0264763Z 2022-11-23T02:07:37.0265026Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0265138Z Ran 1 test in 1.609s 2022-11-23T02:07:37.0265158Z 2022-11-23T02:07:37.0265261Z OK (skipped=1) 2022-11-23T02:07:37.0265280Z 2022-11-23T02:07:37.0265389Z Generating XML reports... 
2022-11-23T02:07:37.0265830Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015750.xml 2022-11-23T02:07:37.0266208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0266387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0266767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0266959Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0266978Z 2022-11-23T02:07:37.0267083Z Running tests... 2022-11-23T02:07:37.0267345Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0267662Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0267916Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0268132Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31574 2022-11-23T02:07:37.0268348Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31575 2022-11-23T02:07:37.0268718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0268890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0269265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0269453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0269893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0270051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0270421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0270610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0270857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0271097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0271497Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0271890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0272124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0272352Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0272596Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8ghiwvwu 2022-11-23T02:07:37.0272916Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8ghiwvwu/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0273183Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpktb626lr 2022-11-23T02:07:37.0273448Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpktb626lr/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0273723Z [1669168679.270510] [d8f8c46cdf70:31575:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0273958Z [1669168679.276680] [d8f8c46cdf70:31575:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0274203Z [1669168679.276680] [d8f8c46cdf70:31575:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0274549Z STAGE:2022-11-23 01:57:59 31575:31575 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0274825Z [1669168679.266688] [d8f8c46cdf70:31574:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0275259Z [1669168679.272298] [d8f8c46cdf70:31574:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0275498Z [1669168679.272298] [d8f8c46cdf70:31574:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0275838Z STAGE:2022-11-23 01:57:59 31574:31574 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0276082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0276310Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:07:37.0276652Z STAGE:2022-11-23 01:57:59 31575:31575 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0276986Z STAGE:2022-11-23 01:57:59 31574:31574 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0277337Z STAGE:2022-11-23 01:57:59 31574:31574 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0277679Z STAGE:2022-11-23 01:57:59 31575:31575 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0278006Z STAGE:2022-11-23 01:58:00 31574:31574 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0278327Z STAGE:2022-11-23 01:58:00 31574:31574 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0278765Z STAGE:2022-11-23 01:58:00 31574:31574 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0278862Z ok (6.753s) 2022-11-23T02:07:37.0278883Z 2022-11-23T02:07:37.0279148Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0279261Z Ran 1 test in 6.753s 2022-11-23T02:07:37.0279281Z 2022-11-23T02:07:37.0279372Z OK 2022-11-23T02:07:37.0279395Z 2022-11-23T02:07:37.0279521Z Generating XML reports... 
2022-11-23T02:07:37.0279967Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015754.xml 2022-11-23T02:07:37.0280328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0280501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0280880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0281110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0281131Z 2022-11-23T02:07:37.0281234Z Running tests... 2022-11-23T02:07:37.0281497Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0281804Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0282124Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0282355Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31696 2022-11-23T02:07:37.0282558Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31697 2022-11-23T02:07:37.0282934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0283105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0283483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0283674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0284035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0284210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0284585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0284759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0285002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0285242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0285635Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0286039Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0286265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0286494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0286753Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7xnj9kny 2022-11-23T02:07:37.0287017Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7xnj9kny/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0287254Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoffczh96 2022-11-23T02:07:37.0287514Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoffczh96/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0287854Z [1669168688.685462] [d8f8c46cdf70:31697:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0288081Z [1669168688.690876] [d8f8c46cdf70:31697:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0288325Z [1669168688.690876] [d8f8c46cdf70:31697:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0288596Z [1669168688.677378] [d8f8c46cdf70:31696:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0288830Z [1669168688.684286] [d8f8c46cdf70:31696:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0289075Z [1669168688.684286] [d8f8c46cdf70:31696:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0289182Z ok (5.560s) 2022-11-23T02:07:37.0289202Z 2022-11-23T02:07:37.0289472Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0289568Z Ran 1 test in 5.560s 2022-11-23T02:07:37.0289588Z 2022-11-23T02:07:37.0289671Z OK 2022-11-23T02:07:37.0289690Z 2022-11-23T02:07:37.0289810Z Generating XML reports... 2022-11-23T02:07:37.0290303Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015803.xml 2022-11-23T02:07:37.0290684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0290858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0291236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0291420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0291440Z 2022-11-23T02:07:37.0291552Z Running tests... 2022-11-23T02:07:37.0291802Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0292108Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0292379Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0292596Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31810 2022-11-23T02:07:37.0292814Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31811 2022-11-23T02:07:37.0293183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0293353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0293722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0293903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0294258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0294432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0294806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0294994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0295235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0295471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0295870Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0296265Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0296540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0296766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0297023Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4el_yirh 2022-11-23T02:07:37.0297287Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4el_yirh/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0297539Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg586vpa6 2022-11-23T02:07:37.0297794Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg586vpa6/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0298067Z [1669168696.689530] [d8f8c46cdf70:31810:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0298305Z [1669168696.696122] [d8f8c46cdf70:31810:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0298543Z [1669168696.696122] [d8f8c46cdf70:31810:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0298853Z [1669168696.698328] [d8f8c46cdf70:31811:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0299094Z [1669168696.703964] [d8f8c46cdf70:31811:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0299329Z [1669168696.703964] [d8f8c46cdf70:31811:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0299425Z ok (5.448s) 2022-11-23T02:07:37.0299447Z 2022-11-23T02:07:37.0299710Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0299827Z Ran 1 test in 5.449s 2022-11-23T02:07:37.0299846Z 2022-11-23T02:07:37.0299931Z OK 2022-11-23T02:07:37.0299950Z 2022-11-23T02:07:37.0300068Z Generating XML reports... 2022-11-23T02:07:37.0300518Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015812.xml 2022-11-23T02:07:37.0300882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0301057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0301435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0301620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0301640Z 2022-11-23T02:07:37.0301747Z Running tests... 2022-11-23T02:07:37.0302003Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0302324Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0302594Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0303341Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78684 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.649s) 2022-11-23T02:07:37.0303363Z 2022-11-23T02:07:37.0303623Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0303720Z Ran 1 test in 1.649s 2022-11-23T02:07:37.0303740Z 2022-11-23T02:07:37.0303844Z OK (skipped=1) 2022-11-23T02:07:37.0303863Z 2022-11-23T02:07:37.0303984Z Generating XML reports... 2022-11-23T02:07:37.0304431Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015820.xml 2022-11-23T02:07:37.0304869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0305041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0305424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0305614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0305634Z 2022-11-23T02:07:37.0305726Z Running tests... 2022-11-23T02:07:37.0305985Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0306295Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0306548Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0307286Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/75648 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.656s) 2022-11-23T02:07:37.0307310Z 2022-11-23T02:07:37.0307572Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0307737Z Ran 1 test in 1.656s 2022-11-23T02:07:37.0307760Z 2022-11-23T02:07:37.0307873Z OK (skipped=1) 2022-11-23T02:07:37.0307892Z 2022-11-23T02:07:37.0308015Z Generating XML reports... 2022-11-23T02:07:37.0308461Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015824.xml 2022-11-23T02:07:37.0308820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0308995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0309383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0309575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0309594Z 2022-11-23T02:07:37.0309697Z Running tests... 2022-11-23T02:07:37.0309959Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0310274Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0310564Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0311301Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78113 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.610s) 2022-11-23T02:07:37.0311326Z 2022-11-23T02:07:37.0311580Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0311676Z Ran 1 test in 1.610s 2022-11-23T02:07:37.0311696Z 2022-11-23T02:07:37.0311797Z OK (skipped=1) 2022-11-23T02:07:37.0311816Z 2022-11-23T02:07:37.0311936Z Generating XML reports... 2022-11-23T02:07:37.0312378Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015828.xml 2022-11-23T02:07:37.0312750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0312924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0313304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0313494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0313513Z 2022-11-23T02:07:37.0313663Z Running tests... 2022-11-23T02:07:37.0313924Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0314229Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0314526Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0314746Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32026 2022-11-23T02:07:37.0314962Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32027 2022-11-23T02:07:37.0315646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0315822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0316201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0316380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0316740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0316912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0317357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0317553Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0317795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0318037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0318435Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0318825Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0319055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0319276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0319532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmsvr1p75 2022-11-23T02:07:37.0319801Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmsvr1p75/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0320047Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz5zxcu7m 2022-11-23T02:07:37.0320312Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz5zxcu7m/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0320587Z [1669168717.212858] [d8f8c46cdf70:32026:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0320830Z [1669168717.219957] [d8f8c46cdf70:32026:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0321071Z [1669168717.219957] [d8f8c46cdf70:32026:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0321335Z [1669168717.219434] [d8f8c46cdf70:32027:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0321561Z [1669168717.224888] [d8f8c46cdf70:32027:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0321794Z [1669168717.224888] [d8f8c46cdf70:32027:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0321894Z ok (6.012s) 2022-11-23T02:07:37.0321914Z 2022-11-23T02:07:37.0322183Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0322366Z Ran 1 test in 6.012s 2022-11-23T02:07:37.0322385Z 2022-11-23T02:07:37.0322472Z OK 2022-11-23T02:07:37.0322491Z 2022-11-23T02:07:37.0322608Z Generating XML reports... 2022-11-23T02:07:37.0323058Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015832.xml 2022-11-23T02:07:37.0323422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0323596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0323980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0324166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0324186Z 2022-11-23T02:07:37.0324285Z Running tests... 2022-11-23T02:07:37.0324546Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0324856Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0325121Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0325325Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32144 2022-11-23T02:07:37.0325590Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32145 2022-11-23T02:07:37.0325978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0326153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0326531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0326710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0327070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0327247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0327620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0327795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0328041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0328278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0328676Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0329075Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0329302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0329681Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T02:07:37.0329931Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T02:07:37.0330155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0330516Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T02:07:37.0330761Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T02:07:37.0331010Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9q5djd0h 2022-11-23T02:07:37.0331278Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9q5djd0h/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0331530Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp07daz_un 2022-11-23T02:07:37.0331848Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp07daz_un/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0332120Z [1669168725.827003] [d8f8c46cdf70:32144:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0332359Z [1669168725.833730] [d8f8c46cdf70:32144:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0332601Z [1669168725.833730] [d8f8c46cdf70:32144:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0332865Z [1669168725.828967] [d8f8c46cdf70:32145:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0333095Z [1669168725.834647] [d8f8c46cdf70:32145:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0333335Z [1669168725.834647] [d8f8c46cdf70:32145:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0333437Z ok (5.547s) 2022-11-23T02:07:37.0333456Z 2022-11-23T02:07:37.0333727Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0333838Z Ran 1 test in 5.547s 2022-11-23T02:07:37.0333857Z 2022-11-23T02:07:37.0334047Z OK 2022-11-23T02:07:37.0334068Z 2022-11-23T02:07:37.0334198Z Generating XML reports... 2022-11-23T02:07:37.0334651Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015841.xml 2022-11-23T02:07:37.0335015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0335190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0335571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0335770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0335789Z 2022-11-23T02:07:37.0335896Z Running tests... 2022-11-23T02:07:37.0336154Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0336467Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0336724Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0336928Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32258 2022-11-23T02:07:37.0337142Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32259 2022-11-23T02:07:37.0337507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0337678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0338058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0338248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0338608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0338781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0339159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0339335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0339578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0339818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0340280Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0340677Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0340898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0341146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0341365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0341602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0341987Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0342381Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0342481Z ok (4.428s) 2022-11-23T02:07:37.0342502Z 2022-11-23T02:07:37.0342770Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0342878Z Ran 1 test in 4.428s 2022-11-23T02:07:37.0342897Z 2022-11-23T02:07:37.0342984Z OK 2022-11-23T02:07:37.0343002Z 2022-11-23T02:07:37.0343167Z Generating XML reports... 2022-11-23T02:07:37.0343623Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015849.xml 2022-11-23T02:07:37.0343978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0344150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0344524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0344718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0344738Z 2022-11-23T02:07:37.0344838Z Running tests... 
2022-11-23T02:07:37.0345095Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0345401Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0345655Z test_destroy_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0345870Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32361 2022-11-23T02:07:37.0346072Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32362 2022-11-23T02:07:37.0346446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0346617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0346993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0347187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0347553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0347721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0348092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0348271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0348516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0348759Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0349159Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0349617Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0349840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0350069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0350304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0350540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0350925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0351315Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0351420Z ok (4.161s) 2022-11-23T02:07:37.0351440Z 2022-11-23T02:07:37.0351700Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0351809Z Ran 1 test in 4.161s 2022-11-23T02:07:37.0351829Z 2022-11-23T02:07:37.0351914Z OK 2022-11-23T02:07:37.0351932Z 2022-11-23T02:07:37.0352051Z Generating XML reports... 
2022-11-23T02:07:37.0352547Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015856.xml 2022-11-23T02:07:37.0352926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0353087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0353469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0353657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0353680Z 2022-11-23T02:07:37.0353787Z Running tests... 2022-11-23T02:07:37.0354041Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0354349Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0354632Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0355594Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78767 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.658s) 2022-11-23T02:07:37.0355618Z 2022-11-23T02:07:37.0355880Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0355977Z Ran 1 test in 1.658s 2022-11-23T02:07:37.0356004Z 2022-11-23T02:07:37.0356096Z OK (skipped=1) 2022-11-23T02:07:37.0356119Z 2022-11-23T02:07:37.0356239Z Generating XML reports... 
2022-11-23T02:07:37.0356681Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015902.xml 2022-11-23T02:07:37.0357050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0357224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0357603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0357792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0357812Z 2022-11-23T02:07:37.0357917Z Running tests... 2022-11-23T02:07:37.0358165Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0358473Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0358845Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0359597Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78748 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.641s) 2022-11-23T02:07:37.0359618Z 2022-11-23T02:07:37.0359874Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0359979Z Ran 1 test in 1.642s 2022-11-23T02:07:37.0359999Z 2022-11-23T02:07:37.0360098Z OK (skipped=1) 2022-11-23T02:07:37.0360117Z 2022-11-23T02:07:37.0360238Z Generating XML reports... 
2022-11-23T02:07:37.0360680Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015907.xml 2022-11-23T02:07:37.0361055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0361222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0361601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0361790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0361873Z 2022-11-23T02:07:37.0361984Z Running tests... 2022-11-23T02:07:37.0362242Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0362547Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0362820Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0363038Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32532 2022-11-23T02:07:37.0363240Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32533 2022-11-23T02:07:37.0363615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0363786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0364164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0364353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0364718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0364884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0365251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0365437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0365675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0365921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0366315Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0366715Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0366938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0367161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0367260Z ok (4.210s) 2022-11-23T02:07:37.0367279Z 2022-11-23T02:07:37.0367540Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0367636Z Ran 1 test in 4.210s 2022-11-23T02:07:37.0367716Z 2022-11-23T02:07:37.0367797Z OK 2022-11-23T02:07:37.0367816Z 2022-11-23T02:07:37.0367940Z Generating XML reports... 2022-11-23T02:07:37.0368389Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015911.xml 2022-11-23T02:07:37.0368762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0368939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0369316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0369508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0369528Z 2022-11-23T02:07:37.0369629Z Running tests... 2022-11-23T02:07:37.0369879Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0370187Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0370443Z test_gather (__main__.TestDistBackendWithSpawn) ... 
skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0370463Z 2022-11-23T02:07:37.0370727Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0370837Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0370857Z 2022-11-23T02:07:37.0371018Z OK (skipped=1) 2022-11-23T02:07:37.0371039Z 2022-11-23T02:07:37.0371166Z Generating XML reports... 2022-11-23T02:07:37.0371615Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015918.xml 2022-11-23T02:07:37.0372031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0372195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0372573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0372767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0372786Z 2022-11-23T02:07:37.0372892Z Running tests... 2022-11-23T02:07:37.0373148Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0373458Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0373723Z test_gather_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0373743Z 2022-11-23T02:07:37.0374000Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0374107Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0374126Z 2022-11-23T02:07:37.0374218Z OK (skipped=1) 2022-11-23T02:07:37.0374237Z 2022-11-23T02:07:37.0374353Z Generating XML reports... 
2022-11-23T02:07:37.0374795Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015920.xml 2022-11-23T02:07:37.0375164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0375333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0375712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0375903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0375924Z 2022-11-23T02:07:37.0376031Z Running tests... 2022-11-23T02:07:37.0376280Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0376586Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0376836Z test_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T02:07:37.0376856Z 2022-11-23T02:07:37.0377175Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0377285Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0377305Z 2022-11-23T02:07:37.0377408Z OK (skipped=1) 2022-11-23T02:07:37.0377428Z 2022-11-23T02:07:37.0377549Z Generating XML reports... 
2022-11-23T02:07:37.0377995Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015922.xml 2022-11-23T02:07:37.0378363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0378526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0378908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0379095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0379115Z 2022-11-23T02:07:37.0379223Z Running tests... 2022-11-23T02:07:37.0379486Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0379788Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0380056Z test_gather_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0380076Z 2022-11-23T02:07:37.0380378Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0380494Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0380515Z 2022-11-23T02:07:37.0380606Z OK (skipped=1) 2022-11-23T02:07:37.0380625Z 2022-11-23T02:07:37.0380747Z Generating XML reports... 
2022-11-23T02:07:37.0381229Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015925.xml 2022-11-23T02:07:37.0381600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0381777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0382151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0382338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0382357Z 2022-11-23T02:07:37.0382461Z Running tests... 2022-11-23T02:07:37.0382711Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0383015Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0383281Z test_gather_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0383301Z 2022-11-23T02:07:37.0383556Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0383662Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0383681Z 2022-11-23T02:07:37.0383784Z OK (skipped=1) 2022-11-23T02:07:37.0383807Z 2022-11-23T02:07:37.0383927Z Generating XML reports... 
2022-11-23T02:07:37.0384374Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015927.xml 2022-11-23T02:07:37.0384744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0384909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0385289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0385473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0385492Z 2022-11-23T02:07:37.0385596Z Running tests... 2022-11-23T02:07:37.0385855Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0386160Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0386488Z test_gather_object (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0386509Z 2022-11-23T02:07:37.0386767Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0386873Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0386893Z 2022-11-23T02:07:37.0386984Z OK (skipped=1) 2022-11-23T02:07:37.0387003Z 2022-11-23T02:07:37.0387127Z Generating XML reports... 
2022-11-23T02:07:37.0387568Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015930.xml 2022-11-23T02:07:37.0387938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0388109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0388483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0388673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0388694Z 2022-11-23T02:07:37.0388798Z Running tests... 2022-11-23T02:07:37.0389047Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0389347Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0389669Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0389691Z 2022-11-23T02:07:37.0389953Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0390061Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0390080Z 2022-11-23T02:07:37.0390182Z OK (skipped=1) 2022-11-23T02:07:37.0390200Z 2022-11-23T02:07:37.0390314Z Generating XML reports... 
2022-11-23T02:07:37.0390758Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015932.xml 2022-11-23T02:07:37.0391132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0391295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0391677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0391866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0391886Z 2022-11-23T02:07:37.0391992Z Running tests... 2022-11-23T02:07:37.0392251Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0392553Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0392796Z test_get_backend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0393010Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32866 2022-11-23T02:07:37.0393229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32867 2022-11-23T02:07:37.0393589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0393762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0394145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0394333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0394696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0394869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0395460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0395646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0395961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0396206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0396622Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0397015Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0397240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0397478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0397696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0397933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0398330Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0398709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0398866Z ok (4.338s) 2022-11-23T02:07:37.0398889Z 2022-11-23T02:07:37.0399162Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0399269Z Ran 1 test in 4.338s 2022-11-23T02:07:37.0399289Z 2022-11-23T02:07:37.0399373Z OK 2022-11-23T02:07:37.0399393Z 2022-11-23T02:07:37.0399512Z Generating XML reports... 2022-11-23T02:07:37.0399960Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015934.xml 2022-11-23T02:07:37.0400328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0400504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0400873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0401065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0401085Z 2022-11-23T02:07:37.0401191Z Running tests... 
2022-11-23T02:07:37.0401451Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0401754Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0402027Z test_get_future (__main__.TestDistBackendWithSpawn) ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:07:37.0402047Z 2022-11-23T02:07:37.0402305Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0402412Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0402435Z 2022-11-23T02:07:37.0402538Z OK (skipped=1) 2022-11-23T02:07:37.0402557Z 2022-11-23T02:07:37.0402664Z Generating XML reports... 2022-11-23T02:07:37.0403102Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015941.xml 2022-11-23T02:07:37.0403477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0403650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0404025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0404210Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0404229Z 2022-11-23T02:07:37.0404330Z Running tests... 2022-11-23T02:07:37.0404587Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0404883Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0405183Z test_get_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0405403Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33002 2022-11-23T02:07:37.0405618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33003 2022-11-23T02:07:37.0405992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0406163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0406535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0406725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0407088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0407251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0407626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0407805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0408095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0408343Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0408741Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0409136Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0409365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0409597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0409684Z ok (4.533s) 2022-11-23T02:07:37.0409704Z 2022-11-23T02:07:37.0409961Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0410067Z Ran 1 test in 4.533s 2022-11-23T02:07:37.0410087Z 2022-11-23T02:07:37.0410174Z OK 2022-11-23T02:07:37.0410197Z 2022-11-23T02:07:37.0410317Z Generating XML reports... 2022-11-23T02:07:37.0410765Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015944.xml 2022-11-23T02:07:37.0411127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0411302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0411668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0411864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0411884Z 2022-11-23T02:07:37.0411988Z Running tests... 2022-11-23T02:07:37.0412252Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0412558Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0412827Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0413040Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33105 2022-11-23T02:07:37.0413247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33106 2022-11-23T02:07:37.0413609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0413782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0414221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0414404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0414768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0414946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0415316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0415500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0415746Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0415975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0416381Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0416772Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0416997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0417282Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0417514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0417750Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0418144Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0418536Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0418629Z ok (4.370s) 2022-11-23T02:07:37.0418648Z 2022-11-23T02:07:37.0418910Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0419017Z Ran 1 test in 4.370s 2022-11-23T02:07:37.0419037Z 2022-11-23T02:07:37.0419123Z OK 2022-11-23T02:07:37.0419142Z 2022-11-23T02:07:37.0419260Z Generating XML reports... 2022-11-23T02:07:37.0419707Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015951.xml 2022-11-23T02:07:37.0420077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0420253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0420631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0420807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0420831Z 2022-11-23T02:07:37.0420938Z Running tests... 
2022-11-23T02:07:37.0421194Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0421503Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0421765Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0421987Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33208 2022-11-23T02:07:37.0422204Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33209 2022-11-23T02:07:37.0422579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0422739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0423119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0423374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0423741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0423907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0424286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0424469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0424708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0424946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0425333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0425735Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0425960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0426199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0426466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0426705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0427100Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0427486Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0427583Z ok (4.344s) 2022-11-23T02:07:37.0427607Z 2022-11-23T02:07:37.0427858Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0427966Z Ran 1 test in 4.344s 2022-11-23T02:07:37.0427985Z 2022-11-23T02:07:37.0428069Z OK 2022-11-23T02:07:37.0428089Z 2022-11-23T02:07:37.0428214Z Generating XML reports... 
2022-11-23T02:07:37.0428670Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015958.xml 2022-11-23T02:07:37.0429043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0429214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0429594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0429772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0429800Z 2022-11-23T02:07:37.0429897Z Running tests... 2022-11-23T02:07:37.0430161Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0430468Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0430731Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0430950Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33311 2022-11-23T02:07:37.0431161Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33312 2022-11-23T02:07:37.0431533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0431703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0432071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0432261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0432688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0432861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0433237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0433424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0433664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0433898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0434286Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0434678Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0434912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0435349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0435699Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptjhu9y11 2022-11-23T02:07:37.0435980Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptjhu9y11/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0436232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2makovlb 2022-11-23T02:07:37.0436502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2makovlb/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0436774Z [1669168809.864425] [d8f8c46cdf70:33312:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0437003Z [1669168809.870735] [d8f8c46cdf70:33312:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0437242Z [1669168809.870735] [d8f8c46cdf70:33312:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0437520Z [1669168809.855405] [d8f8c46cdf70:33311:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0437752Z [1669168809.862406] [d8f8c46cdf70:33311:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0437984Z [1669168809.862406] [d8f8c46cdf70:33311:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0438084Z ok (6.059s) 2022-11-23T02:07:37.0438105Z 2022-11-23T02:07:37.0438378Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0438490Z Ran 1 test in 6.059s 2022-11-23T02:07:37.0438510Z 2022-11-23T02:07:37.0438599Z OK 2022-11-23T02:07:37.0438619Z 2022-11-23T02:07:37.0438739Z Generating XML reports... 2022-11-23T02:07:37.0439174Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020005.xml 2022-11-23T02:07:37.0439551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0439730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0440108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0440293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0440313Z 2022-11-23T02:07:37.0440420Z Running tests... 2022-11-23T02:07:37.0440683Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0441070Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0441296Z test_irecv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0441511Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33429 2022-11-23T02:07:37.0441721Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33430 2022-11-23T02:07:37.0442096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0442268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0442642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0442826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0443189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0443363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0443728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0443911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0444206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0444453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0444852Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0445241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0445468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0445699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0445972Z [1669168817.718784] [d8f8c46cdf70:33430:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0446198Z [1669168818.487494] [d8f8c46cdf70:33430:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0446438Z [1669168818.487494] [d8f8c46cdf70:33430:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0446711Z [1669168817.697035] [d8f8c46cdf70:33429:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0446941Z [1669168818.470246] [d8f8c46cdf70:33429:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0447186Z [1669168818.470246] [d8f8c46cdf70:33429:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0447285Z ok (5.560s) 2022-11-23T02:07:37.0447305Z 2022-11-23T02:07:37.0447576Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0447687Z Ran 1 test in 5.560s 2022-11-23T02:07:37.0447707Z 2022-11-23T02:07:37.0447798Z OK 2022-11-23T02:07:37.0447818Z 2022-11-23T02:07:37.0447926Z Generating XML reports... 
2022-11-23T02:07:37.0448372Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020013.xml 2022-11-23T02:07:37.0448744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0448916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0449292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0449544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0449564Z 2022-11-23T02:07:37.0449669Z Running tests... 2022-11-23T02:07:37.0449930Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0450231Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0450469Z test_isend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0450681Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33539 2022-11-23T02:07:37.0450893Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33540 2022-11-23T02:07:37.0451262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0451436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0451816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0452005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0452366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0452573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0452967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0453153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0453397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0453635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0454029Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0454432Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0454658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0454887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0455151Z [1669168825.719962] [d8f8c46cdf70:33540:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0455384Z [1669168826.486243] [d8f8c46cdf70:33540:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0455623Z [1669168826.486243] [d8f8c46cdf70:33540:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0455898Z [1669168825.697452] [d8f8c46cdf70:33539:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0456127Z [1669168826.487262] [d8f8c46cdf70:33539:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0456364Z [1669168826.487262] [d8f8c46cdf70:33539:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0456465Z ok (5.525s) 2022-11-23T02:07:37.0456486Z 2022-11-23T02:07:37.0456750Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0456856Z Ran 1 test in 5.526s 2022-11-23T02:07:37.0456876Z 2022-11-23T02:07:37.0456954Z OK 2022-11-23T02:07:37.0456982Z 2022-11-23T02:07:37.0457090Z Generating XML reports... 
2022-11-23T02:07:37.0457538Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020021.xml 2022-11-23T02:07:37.0457968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0458141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0458517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0458708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0458728Z 2022-11-23T02:07:37.0458832Z Running tests... 2022-11-23T02:07:37.0459091Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0459389Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0459654Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0459867Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33649 2022-11-23T02:07:37.0460085Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33650 2022-11-23T02:07:37.0460456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0460629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0461056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0461252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0461607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0461778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0462149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0462342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0462581Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0462823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0463220Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0463614Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0463843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0464057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0464397Z STAGE:2022-11-23 02:00:33 33649:33649 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0464730Z STAGE:2022-11-23 02:00:33 33650:33650 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0465006Z [1669168833.778968] [d8f8c46cdf70:33649:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0465236Z [1669168834.833799] [d8f8c46cdf70:33649:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0465476Z [1669168834.833799] [d8f8c46cdf70:33649:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0465810Z STAGE:2022-11-23 02:00:35 33649:33649 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0466085Z [1669168833.800386] [d8f8c46cdf70:33650:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0466311Z [1669168834.832646] [d8f8c46cdf70:33650:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0466603Z [1669168834.832646] [d8f8c46cdf70:33650:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0466932Z STAGE:2022-11-23 02:00:35 33650:33650 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0467284Z STAGE:2022-11-23 02:00:35 33649:33649 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0467629Z STAGE:2022-11-23 02:00:35 33650:33650 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0467728Z ok (5.860s) 2022-11-23T02:07:37.0467749Z 2022-11-23T02:07:37.0468012Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0468123Z Ran 1 test in 5.860s 2022-11-23T02:07:37.0468143Z 2022-11-23T02:07:37.0468232Z OK 2022-11-23T02:07:37.0468251Z 2022-11-23T02:07:37.0468372Z Generating XML reports... 2022-11-23T02:07:37.0468821Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020029.xml 2022-11-23T02:07:37.0469181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0469359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0469781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0469975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0469995Z 2022-11-23T02:07:37.0470096Z Running tests... 
2022-11-23T02:07:37.0470359Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0470669Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0470928Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0471137Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33763 2022-11-23T02:07:37.0471346Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33764 2022-11-23T02:07:37.0471718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0471894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0472278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0472469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0472833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0473004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0473380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0473559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0473799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0474039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0474435Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0474832Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0475261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0475611Z STAGE:2022-11-23 02:00:42 33763:33763 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0475974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0476310Z STAGE:2022-11-23 02:00:42 33764:33764 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0476581Z [1669168842.318809] [d8f8c46cdf70:33763:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0476817Z [1669168843.354283] [d8f8c46cdf70:33763:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0477058Z [1669168843.354283] [d8f8c46cdf70:33763:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0477401Z STAGE:2022-11-23 02:00:43 33763:33763 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0477679Z [1669168842.339649] [d8f8c46cdf70:33764:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0477913Z [1669168843.391380] [d8f8c46cdf70:33764:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0478147Z [1669168843.391380] [d8f8c46cdf70:33764:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0478548Z STAGE:2022-11-23 02:00:43 33764:33764 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0478913Z STAGE:2022-11-23 02:00:43 33763:33763 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0479249Z STAGE:2022-11-23 02:00:43 33764:33764 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0479347Z ok (5.973s) 2022-11-23T02:07:37.0479368Z 2022-11-23T02:07:37.0479638Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0479749Z Ran 1 test in 5.974s 2022-11-23T02:07:37.0479768Z 2022-11-23T02:07:37.0479860Z OK 2022-11-23T02:07:37.0479879Z 2022-11-23T02:07:37.0479997Z Generating XML reports... 2022-11-23T02:07:37.0480443Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020038.xml 2022-11-23T02:07:37.0480808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0480985Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0481391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0481579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0481598Z 2022-11-23T02:07:37.0481702Z Running tests... 
2022-11-23T02:07:37.0481965Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0482275Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0482563Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0482774Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33877 2022-11-23T02:07:37.0482989Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33878 2022-11-23T02:07:37.0483353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0483524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0483901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0484085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0484449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0484685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0485066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0485253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0485499Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0485727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0486129Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0486526Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0486753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0486993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0487214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0487446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0487891Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0488299Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0488524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:07:37.0488764Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:07:37.0489161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:37.0489556Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:07:37.0489785Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T02:07:37.0489885Z ok (21.793s) 2022-11-23T02:07:37.0489906Z 2022-11-23T02:07:37.0490170Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0490281Z Ran 1 test in 21.793s 2022-11-23T02:07:37.0490300Z 2022-11-23T02:07:37.0490385Z OK 2022-11-23T02:07:37.0490404Z 2022-11-23T02:07:37.0490513Z Generating XML reports... 2022-11-23T02:07:37.0490958Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020046.xml 2022-11-23T02:07:37.0491323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0491498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0491878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0492068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0492087Z 2022-11-23T02:07:37.0492189Z Running tests... 2022-11-23T02:07:37.0492450Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0492747Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0493045Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0493260Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33998 2022-11-23T02:07:37.0493473Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33999 2022-11-23T02:07:37.0493909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0494083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0494463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0494655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0495021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0495180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0495551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0495736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0495982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0496228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0496627Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0497086Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0497321Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0497558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0497769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0498000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0498404Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0498796Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0499038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:07:37.0499279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:07:37.0499664Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:37.0500049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:37.0500281Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T02:07:37.0500494Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T02:07:37.0500595Z ok (21.400s) 2022-11-23T02:07:37.0500615Z 2022-11-23T02:07:37.0500875Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0500985Z Ran 1 test in 21.400s 2022-11-23T02:07:37.0501005Z 2022-11-23T02:07:37.0501094Z OK 2022-11-23T02:07:37.0501113Z 2022-11-23T02:07:37.0501237Z Generating XML reports... 
2022-11-23T02:07:37.0501689Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020111.xml 2022-11-23T02:07:37.0502055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0502231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0502596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0502785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0502867Z 2022-11-23T02:07:37.0502979Z Running tests... 2022-11-23T02:07:37.0503244Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0503554Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0503964Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:37.0503985Z 2022-11-23T02:07:37.0504247Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0504353Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0504372Z 2022-11-23T02:07:37.0504463Z OK (skipped=1) 2022-11-23T02:07:37.0504492Z 2022-11-23T02:07:37.0504600Z Generating XML reports... 
2022-11-23T02:07:37.0505046Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020135.xml 2022-11-23T02:07:37.0505421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0505593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0505969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0506207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0506229Z 2022-11-23T02:07:37.0506337Z Running tests... 2022-11-23T02:07:37.0506597Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0506891Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0507275Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.003s) 2022-11-23T02:07:37.0507294Z 2022-11-23T02:07:37.0507551Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0507663Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0507682Z 2022-11-23T02:07:37.0507781Z OK (skipped=1) 2022-11-23T02:07:37.0507800Z 2022-11-23T02:07:37.0507920Z Generating XML reports... 
2022-11-23T02:07:37.0508365Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020137.xml 2022-11-23T02:07:37.0508737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0508910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0509277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0509464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0509483Z 2022-11-23T02:07:37.0509585Z Running tests... 2022-11-23T02:07:37.0509840Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0510153Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0510572Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:37.0510592Z 2022-11-23T02:07:37.0510851Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0510961Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0510981Z 2022-11-23T02:07:37.0511084Z OK (skipped=1) 2022-11-23T02:07:37.0511104Z 2022-11-23T02:07:37.0511211Z Generating XML reports... 
2022-11-23T02:07:37.0511651Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020140.xml 2022-11-23T02:07:37.0512024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0512194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0512630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0512814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0512833Z 2022-11-23T02:07:37.0512933Z Running tests... 2022-11-23T02:07:37.0513189Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0513483Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0513893Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:37.0513913Z 2022-11-23T02:07:37.0514169Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0514275Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0514294Z 2022-11-23T02:07:37.0514396Z OK (skipped=1) 2022-11-23T02:07:37.0514418Z 2022-11-23T02:07:37.0514538Z Generating XML reports... 
2022-11-23T02:07:37.0514982Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020142.xml 2022-11-23T02:07:37.0515655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0515905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0516287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0516476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0516497Z 2022-11-23T02:07:37.0516597Z Running tests... 2022-11-23T02:07:37.0516863Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0517172Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0517582Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:37.0517603Z 2022-11-23T02:07:37.0517855Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0517963Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0517982Z 2022-11-23T02:07:37.0518082Z OK (skipped=1) 2022-11-23T02:07:37.0518101Z 2022-11-23T02:07:37.0518212Z Generating XML reports... 
2022-11-23T02:07:37.0518656Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020144.xml 2022-11-23T02:07:37.0519025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0519197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0519576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0519765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0519784Z 2022-11-23T02:07:37.0519888Z Running tests... 2022-11-23T02:07:37.0520146Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0520450Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0520843Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T02:07:37.0520864Z 2022-11-23T02:07:37.0521122Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0521229Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0521248Z 2022-11-23T02:07:37.0521351Z OK (skipped=1) 2022-11-23T02:07:37.0521370Z 2022-11-23T02:07:37.0521489Z Generating XML reports... 
2022-11-23T02:07:37.0521928Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020147.xml 2022-11-23T02:07:37.0522381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0522554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0522929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0523111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0523131Z 2022-11-23T02:07:37.0523235Z Running tests... 2022-11-23T02:07:37.0523495Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0523797Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0524198Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T02:07:37.0524217Z 2022-11-23T02:07:37.0524480Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0524586Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0524606Z 2022-11-23T02:07:37.0524705Z OK (skipped=1) 2022-11-23T02:07:37.0524724Z 2022-11-23T02:07:37.0524832Z Generating XML reports... 
2022-11-23T02:07:37.0525324Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020149.xml 2022-11-23T02:07:37.0525702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0525877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0526253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0526440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0526460Z 2022-11-23T02:07:37.0526564Z Running tests... 2022-11-23T02:07:37.0526829Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0527137Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0527523Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T02:07:37.0527555Z 2022-11-23T02:07:37.0527805Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0527915Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0527934Z 2022-11-23T02:07:37.0528040Z OK (skipped=1) 2022-11-23T02:07:37.0528059Z 2022-11-23T02:07:37.0528180Z Generating XML reports... 
2022-11-23T02:07:37.0528625Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020152.xml 2022-11-23T02:07:37.0528993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0529167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0529546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0529725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0529745Z 2022-11-23T02:07:37.0529848Z Running tests... 2022-11-23T02:07:37.0530111Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0530419Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0530811Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T02:07:37.0530831Z 2022-11-23T02:07:37.0531087Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0531197Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0531216Z 2022-11-23T02:07:37.0531378Z OK (skipped=1) 2022-11-23T02:07:37.0531398Z 2022-11-23T02:07:37.0531517Z Generating XML reports... 
2022-11-23T02:07:37.0531949Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020154.xml 2022-11-23T02:07:37.0532322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0532496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0532872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0533057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0533077Z 2022-11-23T02:07:37.0533178Z Running tests... 2022-11-23T02:07:37.0533443Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0533747Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0534034Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL backend supports high priority stream (0.002s) 2022-11-23T02:07:37.0534065Z 2022-11-23T02:07:37.0534311Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0534417Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0534436Z 2022-11-23T02:07:37.0534535Z OK (skipped=1) 2022-11-23T02:07:37.0534606Z 2022-11-23T02:07:37.0534731Z Generating XML reports... 
2022-11-23T02:07:37.0535176Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020156.xml 2022-11-23T02:07:37.0535543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0535718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0536092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0536277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0536307Z 2022-11-23T02:07:37.0536399Z Running tests... 2022-11-23T02:07:37.0536654Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0536966Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0537219Z test_new_subgroups (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T02:07:37.0537239Z 2022-11-23T02:07:37.0537497Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0537607Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0537626Z 2022-11-23T02:07:37.0537727Z OK (skipped=1) 2022-11-23T02:07:37.0537747Z 2022-11-23T02:07:37.0537862Z Generating XML reports... 
2022-11-23T02:07:37.0538292Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020159.xml 2022-11-23T02:07:37.0538668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0538842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0539222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0539415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0539435Z 2022-11-23T02:07:37.0539539Z Running tests... 2022-11-23T02:07:37.0539801Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0540105Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0540362Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.003s) 2022-11-23T02:07:37.0540389Z 2022-11-23T02:07:37.0540695Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0540807Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0540827Z 2022-11-23T02:07:37.0540928Z OK (skipped=1) 2022-11-23T02:07:37.0540948Z 2022-11-23T02:07:37.0541065Z Generating XML reports... 
2022-11-23T02:07:37.0541513Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020201.xml 2022-11-23T02:07:37.0541884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0542057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0542434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0542613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0542644Z 2022-11-23T02:07:37.0542735Z Running tests... 2022-11-23T02:07:37.0542995Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0543304Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0543608Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T02:07:37.0543628Z 2022-11-23T02:07:37.0543931Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0544046Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0544067Z 2022-11-23T02:07:37.0544171Z OK (skipped=1) 2022-11-23T02:07:37.0544190Z 2022-11-23T02:07:37.0544303Z Generating XML reports... 
2022-11-23T02:07:37.0544733Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020204.xml 2022-11-23T02:07:37.0545101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0545277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0545652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0545843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0545862Z 2022-11-23T02:07:37.0545965Z Running tests... 2022-11-23T02:07:37.0546229Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0546537Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0546837Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0547042Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34548 2022-11-23T02:07:37.0547258Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34549 2022-11-23T02:07:37.0547632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0547806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0548181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0548370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0548729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0548898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0549259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0549446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0549687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0549987Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0550388Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0550788Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0551013Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0551233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0551329Z ok (4.236s) 2022-11-23T02:07:37.0551350Z 2022-11-23T02:07:37.0551602Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0551707Z Ran 1 test in 4.236s 2022-11-23T02:07:37.0551727Z 2022-11-23T02:07:37.0551815Z OK 2022-11-23T02:07:37.0551834Z 2022-11-23T02:07:37.0551954Z Generating XML reports... 2022-11-23T02:07:37.0552398Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020206.xml 2022-11-23T02:07:37.0552762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0552983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0553371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0553555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0553575Z 2022-11-23T02:07:37.0553667Z Running tests... 2022-11-23T02:07:37.0553926Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0554232Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0554529Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0554741Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34651 2022-11-23T02:07:37.0554952Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34652 2022-11-23T02:07:37.0555551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0555721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0556086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0556278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0556640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0556817Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0557188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0557369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0557615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0557858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0558256Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0558642Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0558869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0559182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0559284Z ok (4.337s) 2022-11-23T02:07:37.0559304Z 2022-11-23T02:07:37.0559563Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0559669Z Ran 1 test in 4.337s 2022-11-23T02:07:37.0559688Z 2022-11-23T02:07:37.0559773Z OK 2022-11-23T02:07:37.0559792Z 2022-11-23T02:07:37.0559912Z Generating XML reports... 2022-11-23T02:07:37.0560349Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020213.xml 2022-11-23T02:07:37.0560719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0560890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0561265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0561454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0561474Z 2022-11-23T02:07:37.0561576Z Running tests... 2022-11-23T02:07:37.0561838Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0562147Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0562501Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 4 (0.002s) 2022-11-23T02:07:37.0562524Z 2022-11-23T02:07:37.0562779Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0562887Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0562906Z 2022-11-23T02:07:37.0563009Z OK (skipped=1) 2022-11-23T02:07:37.0563028Z 2022-11-23T02:07:37.0563146Z Generating XML reports... 2022-11-23T02:07:37.0563597Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020220.xml 2022-11-23T02:07:37.0563968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0564139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0564515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0564709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0564729Z 2022-11-23T02:07:37.0564824Z Running tests... 2022-11-23T02:07:37.0565084Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0565393Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0565687Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T02:07:37.0565707Z 2022-11-23T02:07:37.0565968Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0566077Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0566097Z 2022-11-23T02:07:37.0566204Z OK (skipped=1) 2022-11-23T02:07:37.0566223Z 2022-11-23T02:07:37.0566340Z Generating XML reports... 
2022-11-23T02:07:37.0566784Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020222.xml 2022-11-23T02:07:37.0567146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0567316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0567691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0567877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0567896Z 2022-11-23T02:07:37.0567996Z Running tests... 2022-11-23T02:07:37.0568311Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0568620Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0568898Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0569652Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78112 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.665s) 2022-11-23T02:07:37.0569674Z 2022-11-23T02:07:37.0569921Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0570031Z Ran 1 test in 1.666s 2022-11-23T02:07:37.0570051Z 2022-11-23T02:07:37.0570150Z OK (skipped=1) 2022-11-23T02:07:37.0570168Z 2022-11-23T02:07:37.0570289Z Generating XML reports... 
2022-11-23T02:07:37.0570730Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020225.xml 2022-11-23T02:07:37.0571100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0571272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0571697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0571896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0571916Z 2022-11-23T02:07:37.0572008Z Running tests... 2022-11-23T02:07:37.0572268Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0572570Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0572848Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0573072Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34854 2022-11-23T02:07:37.0573285Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34855 2022-11-23T02:07:37.0573661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0573834Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0574201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0574496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0574867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0575039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0575412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0575596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0575843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0576087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0576489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0576873Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0577097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0577319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0577633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8ri662h5 2022-11-23T02:07:37.0577901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8ri662h5/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0578150Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl8lrsyzj 2022-11-23T02:07:37.0578420Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl8lrsyzj/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0578695Z [1669168953.944355] [d8f8c46cdf70:34854:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0578931Z [1669168953.952007] [d8f8c46cdf70:34854:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0579158Z [1669168953.952007] [d8f8c46cdf70:34854:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0579433Z [1669168953.950232] [d8f8c46cdf70:34855:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0579664Z [1669168953.955978] [d8f8c46cdf70:34855:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0579942Z [1669168953.955978] [d8f8c46cdf70:34855:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0580050Z ok (6.060s) 2022-11-23T02:07:37.0580070Z 2022-11-23T02:07:37.0580332Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0580445Z Ran 1 test in 6.060s 2022-11-23T02:07:37.0580464Z 2022-11-23T02:07:37.0580555Z OK 2022-11-23T02:07:37.0580574Z 2022-11-23T02:07:37.0580693Z Generating XML reports... 2022-11-23T02:07:37.0581165Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020229.xml 2022-11-23T02:07:37.0581553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0581731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0582113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0582300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0582320Z 2022-11-23T02:07:37.0582425Z Running tests... 2022-11-23T02:07:37.0582685Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0582990Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0583258Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0583464Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34972 2022-11-23T02:07:37.0583683Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34973 2022-11-23T02:07:37.0584056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0584224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0584605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0584790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0585157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0585332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0585703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0585940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0586182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0586421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0586828Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0587223Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0587445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0587668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0587942Z [1669168962.951800] [d8f8c46cdf70:34972:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0588222Z [1669168962.961072] [d8f8c46cdf70:34973:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0588442Z [1669168962.957824] [d8f8c46cdf70:34972:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0588727Z [1669168962.957824] [d8f8c46cdf70:34972:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0588961Z [1669168962.966028] [d8f8c46cdf70:34973:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0589197Z [1669168962.966028] [d8f8c46cdf70:34973:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0589295Z ok (6.031s) 2022-11-23T02:07:37.0589316Z 2022-11-23T02:07:37.0589579Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0589692Z Ran 1 test in 6.032s 2022-11-23T02:07:37.0589711Z 2022-11-23T02:07:37.0589802Z OK 2022-11-23T02:07:37.0589821Z 2022-11-23T02:07:37.0589943Z Generating XML reports... 
2022-11-23T02:07:37.0590379Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020237.xml 2022-11-23T02:07:37.0590752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0590929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0591310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0591498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0591518Z 2022-11-23T02:07:37.0591621Z Running tests... 2022-11-23T02:07:37.0591882Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0592193Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0592474Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0592682Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35087 2022-11-23T02:07:37.0592896Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35088 2022-11-23T02:07:37.0593265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0593437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0593812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0593998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0594428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0594599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0594960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0595368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0595623Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0595866Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0596271Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0596667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0596898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0597124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0597472Z [1669168971.568101] [d8f8c46cdf70:35087:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0597747Z [1669168971.577334] [d8f8c46cdf70:35088:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0597979Z [1669168971.574060] [d8f8c46cdf70:35087:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0598217Z [1669168971.574060] [d8f8c46cdf70:35087:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0598442Z [1669168971.582391] [d8f8c46cdf70:35088:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0598672Z [1669168971.582391] [d8f8c46cdf70:35088:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0598768Z ok (6.031s) 2022-11-23T02:07:37.0598789Z 2022-11-23T02:07:37.0599054Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0599170Z Ran 1 test in 6.032s 2022-11-23T02:07:37.0599189Z 2022-11-23T02:07:37.0599280Z OK 2022-11-23T02:07:37.0599299Z 2022-11-23T02:07:37.0599406Z Generating XML reports... 
2022-11-23T02:07:37.0599851Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020246.xml 2022-11-23T02:07:37.0600223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0600400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0600786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0600978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0600998Z 2022-11-23T02:07:37.0601105Z Running tests... 2022-11-23T02:07:37.0601365Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0601679Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0601944Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0602684Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77123 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.653s) 2022-11-23T02:07:37.0602783Z 2022-11-23T02:07:37.0603038Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0603145Z Ran 1 test in 1.654s 2022-11-23T02:07:37.0603164Z 2022-11-23T02:07:37.0603270Z OK (skipped=1) 2022-11-23T02:07:37.0603289Z 2022-11-23T02:07:37.0603407Z Generating XML reports... 
2022-11-23T02:07:37.0603854Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020255.xml 2022-11-23T02:07:37.0604225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0604400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0604782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0604961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0604980Z 2022-11-23T02:07:37.0605093Z Running tests... 2022-11-23T02:07:37.0605354Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0605659Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0605950Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0606743Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77292 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.638s) 2022-11-23T02:07:37.0606766Z 2022-11-23T02:07:37.0607032Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0607139Z Ran 1 test in 1.638s 2022-11-23T02:07:37.0607158Z 2022-11-23T02:07:37.0607257Z OK (skipped=1) 2022-11-23T02:07:37.0607276Z 2022-11-23T02:07:37.0607401Z Generating XML reports... 
2022-11-23T02:07:37.0607837Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020259.xml 2022-11-23T02:07:37.0608208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0608388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0608768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0608958Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0608978Z 2022-11-23T02:07:37.0609080Z Running tests... 2022-11-23T02:07:37.0609343Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0609655Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0609953Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0610180Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35270 2022-11-23T02:07:37.0610395Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35271 2022-11-23T02:07:37.0610769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0610945Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0611324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0611507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0611872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0612042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0612460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0612649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0612893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0613138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0613537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0613932Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0614160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0614383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0614523Z skip: Need at least 4 CUDA devices (4.231s) 2022-11-23T02:07:37.0614554Z 2022-11-23T02:07:37.0614805Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0614917Z Ran 1 test in 4.231s 2022-11-23T02:07:37.0614937Z 2022-11-23T02:07:37.0615046Z OK (skipped=1) 2022-11-23T02:07:37.0615065Z 2022-11-23T02:07:37.0615282Z Generating XML reports... 2022-11-23T02:07:37.0615740Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020303.xml 2022-11-23T02:07:37.0616112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0616286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0616659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0616843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0616870Z 2022-11-23T02:07:37.0616963Z Running tests... 2022-11-23T02:07:37.0617222Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0617528Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0617858Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0618073Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35373 2022-11-23T02:07:37.0618285Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35374 2022-11-23T02:07:37.0618648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0618818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0619189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0619381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0619749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0619921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0620294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0620482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0620725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0620966Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0621418Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0621813Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0622041Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0622270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0622417Z skip: Need at least 4 CUDA devices (4.225s) 2022-11-23T02:07:37.0622436Z 2022-11-23T02:07:37.0622700Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0622810Z Ran 1 test in 4.225s 2022-11-23T02:07:37.0622830Z 2022-11-23T02:07:37.0622935Z OK (skipped=1) 2022-11-23T02:07:37.0622954Z 2022-11-23T02:07:37.0623074Z Generating XML reports... 2022-11-23T02:07:37.0623507Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020310.xml 2022-11-23T02:07:37.0623882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0624052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0624496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0624690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0624709Z 2022-11-23T02:07:37.0624813Z Running tests... 2022-11-23T02:07:37.0625077Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0625387Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0625668Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0626406Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/84886 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.660s) 2022-11-23T02:07:37.0626436Z 2022-11-23T02:07:37.0626685Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0626796Z Ran 1 test in 1.661s 2022-11-23T02:07:37.0626815Z 2022-11-23T02:07:37.0626917Z OK (skipped=1) 2022-11-23T02:07:37.0626936Z 2022-11-23T02:07:37.0627052Z Generating XML reports... 2022-11-23T02:07:37.0627501Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020316.xml 2022-11-23T02:07:37.0627874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0628048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0628433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0628610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0628646Z 2022-11-23T02:07:37.0628738Z Running tests... 2022-11-23T02:07:37.0629000Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0629307Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0629567Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0629782Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35510 2022-11-23T02:07:37.0629995Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35511 2022-11-23T02:07:37.0630366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0630599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0630971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0631159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0631528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0631699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0632067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0632250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0632490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0632736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0633124Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0633524Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0633800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0634050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0634269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0634504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0634900Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0635509Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0635852Z STAGE:2022-11-23 02:03:24 35511:35511 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0636191Z STAGE:2022-11-23 02:03:24 35510:35510 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0636461Z [1669169005.018380] [d8f8c46cdf70:35511:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0636700Z [1669169006.053898] [d8f8c46cdf70:35511:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0636942Z [1669169006.053898] [d8f8c46cdf70:35511:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0637213Z [1669169004.996852] [d8f8c46cdf70:35510:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0637449Z [1669169006.026148] [d8f8c46cdf70:35510:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0637685Z [1669169006.026148] [d8f8c46cdf70:35510:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0638236Z STAGE:2022-11-23 02:03:26 35511:35511 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:03:26 35510:35510 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0638258Z 2022-11-23T02:07:37.0638610Z STAGE:2022-11-23 02:03:26 35510:35510 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0638957Z STAGE:2022-11-23 02:03:26 35511:35511 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0639479Z STAGE:2022-11-23 02:03:26 35511:35511 ActivityProfilerController.cpp:300] Completed Stage: Warm UpSTAGE:2022-11-23 02:03:26 35510:35510 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0639577Z 2022-11-23T02:07:37.0640132Z STAGE:2022-11-23 02:03:26 35510:35510 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:03:26 35511:35511 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0640153Z 2022-11-23T02:07:37.0640496Z STAGE:2022-11-23 02:03:26 35510:35510 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0640825Z STAGE:2022-11-23 02:03:26 35511:35511 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0640923Z ok (5.839s) 2022-11-23T02:07:37.0640942Z 2022-11-23T02:07:37.0641207Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0641315Z Ran 1 test in 5.840s 2022-11-23T02:07:37.0641334Z 2022-11-23T02:07:37.0641425Z OK 2022-11-23T02:07:37.0641444Z 2022-11-23T02:07:37.0641566Z Generating XML reports... 
2022-11-23T02:07:37.0642015Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020321.xml 2022-11-23T02:07:37.0642386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0642623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0643003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0643191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0643210Z 2022-11-23T02:07:37.0643314Z Running tests... 2022-11-23T02:07:37.0643574Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0643883Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0644149Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0644370Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35624 2022-11-23T02:07:37.0644583Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35625 2022-11-23T02:07:37.0644947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0645125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0645502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0645688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0646052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0646227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0646597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0646782Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0647018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0647258Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0647660Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0648055Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0648283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0648587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0648808Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0649043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0649446Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0649835Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0650157Z STAGE:2022-11-23 02:03:33 35624:35624 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0650479Z STAGE:2022-11-23 02:03:33 35625:35625 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0650758Z [1669169013.475200] [d8f8c46cdf70:35625:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0650999Z [1669169014.524660] [d8f8c46cdf70:35625:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0651241Z [1669169014.524660] [d8f8c46cdf70:35625:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0651564Z [1669169013.475092] [d8f8c46cdf70:35624:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0651805Z [1669169014.561928] [d8f8c46cdf70:35624:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0652042Z [1669169014.561928] [d8f8c46cdf70:35624:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0652599Z STAGE:2022-11-23 02:03:34 35625:35625 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:03:34 35624:35624 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0652623Z 2022-11-23T02:07:37.0653202Z STAGE:2022-11-23 02:03:34 35624:35624 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 02:03:34 35625:35625 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0653222Z 2022-11-23T02:07:37.0653555Z STAGE:2022-11-23 02:03:34 35624:35624 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0653865Z STAGE:2022-11-23 02:03:34 35625:35625 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0654198Z STAGE:2022-11-23 02:03:34 35624:35624 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0654544Z STAGE:2022-11-23 02:03:34 35624:35624 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0654880Z STAGE:2022-11-23 02:03:34 35625:35625 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0655229Z STAGE:2022-11-23 02:03:34 35625:35625 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0655328Z ok (6.066s) 2022-11-23T02:07:37.0655347Z 2022-11-23T02:07:37.0655611Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0655722Z Ran 1 test in 6.066s 2022-11-23T02:07:37.0655745Z 2022-11-23T02:07:37.0655836Z OK 2022-11-23T02:07:37.0655855Z 2022-11-23T02:07:37.0655963Z Generating XML reports... 
2022-11-23T02:07:37.0656409Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020329.xml 2022-11-23T02:07:37.0656781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0656956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0657335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0657580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0657600Z 2022-11-23T02:07:37.0657702Z Running tests... 2022-11-23T02:07:37.0657961Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0658264Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0658531Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0658747Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35738 2022-11-23T02:07:37.0658962Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35739 2022-11-23T02:07:37.0659331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0659504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0659879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0660064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0660474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0660641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0661013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0661194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0661435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0661668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0662076Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0662471Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0662698Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0662942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0663157Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0663391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0663791Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0664184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0664527Z STAGE:2022-11-23 02:03:42 35738:35738 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0664851Z STAGE:2022-11-23 02:03:42 35739:35739 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0665136Z [1669169022.130944] [d8f8c46cdf70:35739:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0665372Z [1669169023.155128] [d8f8c46cdf70:35739:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0665616Z [1669169023.155128] [d8f8c46cdf70:35739:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0665880Z [1669169022.108666] [d8f8c46cdf70:35738:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0666167Z [1669169023.143896] [d8f8c46cdf70:35738:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0666406Z [1669169023.143896] [d8f8c46cdf70:35738:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0666972Z STAGE:2022-11-23 02:03:43 35739:35739 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:03:43 35738:35738 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0666993Z 2022-11-23T02:07:37.0667565Z STAGE:2022-11-23 02:03:43 35738:35738 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 02:03:43 35739:35739 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0667586Z 2022-11-23T02:07:37.0667912Z STAGE:2022-11-23 02:03:43 35738:35738 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0668244Z STAGE:2022-11-23 02:03:43 35739:35739 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0668575Z STAGE:2022-11-23 02:03:43 35738:35738 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0668917Z STAGE:2022-11-23 02:03:43 35738:35738 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0669296Z STAGE:2022-11-23 02:03:43 35739:35739 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0669655Z STAGE:2022-11-23 02:03:43 35739:35739 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0669742Z ok (6.063s) 2022-11-23T02:07:37.0669770Z 2022-11-23T02:07:37.0670024Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0670133Z Ran 1 test in 6.064s 2022-11-23T02:07:37.0670152Z 2022-11-23T02:07:37.0670240Z OK 2022-11-23T02:07:37.0670259Z 2022-11-23T02:07:37.0670383Z Generating XML reports... 
2022-11-23T02:07:37.0670832Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020338.xml 2022-11-23T02:07:37.0671208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0671379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0671761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0671941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0671961Z 2022-11-23T02:07:37.0672110Z Running tests... 2022-11-23T02:07:37.0672377Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0672689Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0672947Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0673170Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35852 2022-11-23T02:07:37.0673387Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35853 2022-11-23T02:07:37.0673757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0673921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0674298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0674486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0674848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0675200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0675688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0675879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0676124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0676375Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0676765Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0677161Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0677387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0677625Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0677850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0678086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0678484Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0678937Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0679285Z STAGE:2022-11-23 02:03:50 35852:35852 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0679597Z STAGE:2022-11-23 02:03:50 35853:35853 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0679879Z [1669169030.637412] [d8f8c46cdf70:35852:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0680117Z [1669169031.671336] [d8f8c46cdf70:35852:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0680358Z [1669169031.671336] [d8f8c46cdf70:35852:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0680635Z [1669169030.637437] [d8f8c46cdf70:35853:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0680861Z [1669169031.672220] [d8f8c46cdf70:35853:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0681097Z [1669169031.672220] [d8f8c46cdf70:35853:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0681685Z STAGE:2022-11-23 02:03:52 35852:35852 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:03:52 35853:35853 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0681710Z 2022-11-23T02:07:37.0682058Z STAGE:2022-11-23 02:03:52 35852:35852 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0682403Z STAGE:2022-11-23 02:03:52 35853:35853 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0682728Z STAGE:2022-11-23 02:03:52 35853:35853 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0683037Z STAGE:2022-11-23 02:03:52 35852:35852 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0683366Z STAGE:2022-11-23 02:03:52 35853:35853 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0683691Z STAGE:2022-11-23 02:03:52 35852:35852 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0684032Z STAGE:2022-11-23 02:03:52 35853:35853 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0684370Z STAGE:2022-11-23 02:03:52 35852:35852 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0684534Z ok (5.860s) 2022-11-23T02:07:37.0684555Z 2022-11-23T02:07:37.0684820Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0684931Z Ran 1 test in 5.861s 2022-11-23T02:07:37.0684950Z 2022-11-23T02:07:37.0685027Z OK 2022-11-23T02:07:37.0685059Z 2022-11-23T02:07:37.0685172Z Generating XML reports... 
2022-11-23T02:07:37.0685628Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020346.xml 2022-11-23T02:07:37.0686009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0686185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0686564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0686760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0686779Z 2022-11-23T02:07:37.0686885Z Running tests... 2022-11-23T02:07:37.0687144Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0687441Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0687755Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0687984Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35966 2022-11-23T02:07:37.0688198Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35967 2022-11-23T02:07:37.0688571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0688742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0689124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0689320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0689673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0689841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0690228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0690413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0690660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0690899Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0691303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0691706Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0691937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0692152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0692308Z skip: Skipped due to small world size. (4.224s) 2022-11-23T02:07:37.0692330Z 2022-11-23T02:07:37.0692592Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0692701Z Ran 1 test in 4.224s 2022-11-23T02:07:37.0692720Z 2022-11-23T02:07:37.0692827Z OK (skipped=1) 2022-11-23T02:07:37.0692845Z 2022-11-23T02:07:37.0692964Z Generating XML reports... 2022-11-23T02:07:37.0693410Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020355.xml 2022-11-23T02:07:37.0693842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0694015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0694379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0694572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0694592Z 2022-11-23T02:07:37.0694691Z Running tests... 2022-11-23T02:07:37.0694953Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0695260Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0695517Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0695730Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36069 2022-11-23T02:07:37.0695948Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36070 2022-11-23T02:07:37.0696306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0696480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0696908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0697110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0697477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0697649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0698027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0698211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0698456Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0698688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0699089Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0699480Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0699708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0699932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0700088Z skip: Skipped due to small world size. (4.228s) 2022-11-23T02:07:37.0700107Z 2022-11-23T02:07:37.0700373Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0700484Z Ran 1 test in 4.228s 2022-11-23T02:07:37.0700504Z 2022-11-23T02:07:37.0700605Z OK (skipped=1) 2022-11-23T02:07:37.0700625Z 2022-11-23T02:07:37.0700733Z Generating XML reports... 2022-11-23T02:07:37.0701181Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020401.xml 2022-11-23T02:07:37.0701555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0701731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0702112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0702303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0702323Z 2022-11-23T02:07:37.0702424Z Running tests... 2022-11-23T02:07:37.0702687Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0703043Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0703304Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0703522Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36172 2022-11-23T02:07:37.0703740Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36173 2022-11-23T02:07:37.0704111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0704285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0704663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0704849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0705216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0705374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0705745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0705978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0706226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0706463Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0706861Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0707250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0707480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0707701Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0707847Z skip: Skipped due to small world size. (4.268s) 2022-11-23T02:07:37.0707867Z 2022-11-23T02:07:37.0708136Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0708247Z Ran 1 test in 4.268s 2022-11-23T02:07:37.0708267Z 2022-11-23T02:07:37.0708370Z OK (skipped=1) 2022-11-23T02:07:37.0708389Z 2022-11-23T02:07:37.0708504Z Generating XML reports... 2022-11-23T02:07:37.0708948Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020408.xml 2022-11-23T02:07:37.0709317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0709489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0709862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0710053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0710074Z 2022-11-23T02:07:37.0710180Z Running tests... 2022-11-23T02:07:37.0710447Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0710758Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0711013Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0711230Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36275 2022-11-23T02:07:37.0711444Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36276 2022-11-23T02:07:37.0711814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0712036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0712422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0712611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0712979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0713152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0713529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0713717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0713959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0714192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0714595Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0715328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0715582Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0715809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0715964Z skip: Skipped due to small world size. (4.219s) 2022-11-23T02:07:37.0715984Z 2022-11-23T02:07:37.0716253Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0716362Z Ran 1 test in 4.219s 2022-11-23T02:07:37.0716382Z 2022-11-23T02:07:37.0716486Z OK (skipped=1) 2022-11-23T02:07:37.0716509Z 2022-11-23T02:07:37.0716618Z Generating XML reports... 2022-11-23T02:07:37.0717068Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020415.xml 2022-11-23T02:07:37.0717437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0717616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0717999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0718187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0718206Z 2022-11-23T02:07:37.0718309Z Running tests... 2022-11-23T02:07:37.0718570Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0718865Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0719115Z test_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0719329Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36378 2022-11-23T02:07:37.0719543Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36379 2022-11-23T02:07:37.0719914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0720087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0720459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0720646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0721001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0721161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0721624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0721811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0722055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0722304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0722705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0723102Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0723331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0723671Z STAGE:2022-11-23 02:04:26 36379:36379 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0723890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0724215Z STAGE:2022-11-23 02:04:26 36378:36378 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0724560Z [1669169066.164422] [d8f8c46cdf70:36378:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0724809Z [1669169067.191114] [d8f8c46cdf70:36378:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0725051Z [1669169067.191114] [d8f8c46cdf70:36378:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0725326Z [1669169066.184745] [d8f8c46cdf70:36379:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0725561Z [1669169067.212815] [d8f8c46cdf70:36379:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0725801Z [1669169067.212815] [d8f8c46cdf70:36379:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0726364Z STAGE:2022-11-23 02:04:27 36378:36378 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:04:27 36379:36379 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0726386Z 2022-11-23T02:07:37.0726740Z STAGE:2022-11-23 02:04:27 36379:36379 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0727090Z STAGE:2022-11-23 02:04:27 36378:36378 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0727405Z STAGE:2022-11-23 02:04:27 36379:36379 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0727730Z STAGE:2022-11-23 02:04:27 36378:36378 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0728069Z STAGE:2022-11-23 02:04:27 36379:36379 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0728421Z STAGE:2022-11-23 02:04:27 36379:36379 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0728760Z STAGE:2022-11-23 02:04:27 36378:36378 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0729109Z STAGE:2022-11-23 02:04:27 36378:36378 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0729208Z ok (5.827s) 2022-11-23T02:07:37.0729227Z 2022-11-23T02:07:37.0729491Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0729587Z Ran 1 test in 5.827s 2022-11-23T02:07:37.0729618Z 2022-11-23T02:07:37.0729696Z OK 2022-11-23T02:07:37.0729715Z 2022-11-23T02:07:37.0729837Z Generating XML reports... 
2022-11-23T02:07:37.0730291Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020422.xml 2022-11-23T02:07:37.0730732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0730907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0731291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0731484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0731504Z 2022-11-23T02:07:37.0731608Z Running tests... 2022-11-23T02:07:37.0731856Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0732164Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0732416Z test_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0732638Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36492 2022-11-23T02:07:37.0732857Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36493 2022-11-23T02:07:37.0733227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0733449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0733837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0734012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0734379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0734549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0734920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0735110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0735359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0735606Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0736009Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0736411Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0736629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0736964Z STAGE:2022-11-23 02:04:34 36492:36492 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0737179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0737521Z STAGE:2022-11-23 02:04:34 36493:36493 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0737798Z [1669169074.527769] [d8f8c46cdf70:36492:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0738037Z [1669169075.562875] [d8f8c46cdf70:36492:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0738279Z [1669169075.562875] [d8f8c46cdf70:36492:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0738554Z [1669169074.548271] [d8f8c46cdf70:36493:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0738789Z [1669169075.576376] [d8f8c46cdf70:36493:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0739083Z [1669169075.576376] [d8f8c46cdf70:36493:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0739624Z STAGE:2022-11-23 02:04:35 36492:36492 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:04:35 36493:36493 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0739665Z 2022-11-23T02:07:37.0740224Z STAGE:2022-11-23 02:04:35 36492:36492 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 02:04:35 36493:36493 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0740258Z 2022-11-23T02:07:37.0740573Z STAGE:2022-11-23 02:04:35 36493:36493 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0740901Z STAGE:2022-11-23 02:04:35 36492:36492 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0741235Z STAGE:2022-11-23 02:04:35 36493:36493 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0741570Z STAGE:2022-11-23 02:04:35 36492:36492 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0741917Z STAGE:2022-11-23 02:04:35 36493:36493 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0742313Z STAGE:2022-11-23 02:04:35 36492:36492 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0742424Z ok (5.916s) 2022-11-23T02:07:37.0742444Z 2022-11-23T02:07:37.0742711Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0742808Z Ran 1 test in 5.916s 2022-11-23T02:07:37.0742842Z 2022-11-23T02:07:37.0742919Z OK 2022-11-23T02:07:37.0742938Z 2022-11-23T02:07:37.0743058Z Generating XML reports... 
2022-11-23T02:07:37.0743511Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020430.xml 2022-11-23T02:07:37.0743889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0744064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0744444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0744639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0744659Z 2022-11-23T02:07:37.0744765Z Running tests... 2022-11-23T02:07:37.0745017Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0745332Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0745612Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports reduce multigpu (0.002s) 2022-11-23T02:07:37.0745632Z 2022-11-23T02:07:37.0745894Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0746008Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0746027Z 2022-11-23T02:07:37.0746135Z OK (skipped=1) 2022-11-23T02:07:37.0746154Z 2022-11-23T02:07:37.0746275Z Generating XML reports... 
2022-11-23T02:07:37.0746724Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020439.xml 2022-11-23T02:07:37.0747102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0747265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0747645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0747837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0747857Z 2022-11-23T02:07:37.0747964Z Running tests... 2022-11-23T02:07:37.0748226Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0748593Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0748852Z test_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0749074Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36639 2022-11-23T02:07:37.0749280Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36640 2022-11-23T02:07:37.0749655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0749830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0750215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0750406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0750784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0750957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0751332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0751579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0751824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0752065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0752468Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0752868Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0753104Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0753442Z STAGE:2022-11-23 02:04:45 36640:36640 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0753666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0753997Z STAGE:2022-11-23 02:04:45 36639:36639 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0754281Z [1669169085.358517] [d8f8c46cdf70:36639:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0754505Z [1669169086.412740] [d8f8c46cdf70:36639:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0754748Z [1669169086.412740] [d8f8c46cdf70:36639:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0755232Z [1669169085.380621] [d8f8c46cdf70:36640:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0755483Z [1669169086.407409] [d8f8c46cdf70:36640:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0755728Z [1669169086.407409] [d8f8c46cdf70:36640:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0756294Z STAGE:2022-11-23 02:04:46 36639:36639 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:04:46 36640:36640 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0756315Z 2022-11-23T02:07:37.0756667Z STAGE:2022-11-23 02:04:46 36640:36640 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0757014Z STAGE:2022-11-23 02:04:46 36639:36639 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0757486Z STAGE:2022-11-23 02:04:46 36640:36640 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0757819Z STAGE:2022-11-23 02:04:46 36639:36639 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0758140Z STAGE:2022-11-23 02:04:46 36640:36640 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0758474Z STAGE:2022-11-23 02:04:46 36639:36639 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0758821Z STAGE:2022-11-23 02:04:46 36640:36640 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0759173Z STAGE:2022-11-23 02:04:46 36639:36639 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0759276Z ok (5.837s) 2022-11-23T02:07:37.0759296Z 2022-11-23T02:07:37.0759563Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0759672Z Ran 1 test in 5.837s 2022-11-23T02:07:37.0759696Z 2022-11-23T02:07:37.0759787Z OK 2022-11-23T02:07:37.0759807Z 2022-11-23T02:07:37.0759932Z Generating XML reports... 
2022-11-23T02:07:37.0760367Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020441.xml 2022-11-23T02:07:37.0760801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0760987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0761373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0761566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0761586Z 2022-11-23T02:07:37.0761694Z Running tests... 2022-11-23T02:07:37.0761957Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0762269Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0762552Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce_scatter_tensor (0.002s) 2022-11-23T02:07:37.0762588Z 2022-11-23T02:07:37.0762834Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0762945Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0762966Z 2022-11-23T02:07:37.0763075Z OK (skipped=1) 2022-11-23T02:07:37.0763095Z 2022-11-23T02:07:37.0763221Z Generating XML reports... 
2022-11-23T02:07:37.0763670Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020449.xml 2022-11-23T02:07:37.0764047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0764222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0764604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0764786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0764821Z 2022-11-23T02:07:37.0764914Z Running tests... 2022-11-23T02:07:37.0765174Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0765485Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0765762Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports reduce_scatter_v (0.003s) 2022-11-23T02:07:37.0765781Z 2022-11-23T02:07:37.0766042Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0766150Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0766169Z 2022-11-23T02:07:37.0766276Z OK (skipped=1) 2022-11-23T02:07:37.0766295Z 2022-11-23T02:07:37.0766419Z Generating XML reports... 
2022-11-23T02:07:37.0766853Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020452.xml 2022-11-23T02:07:37.0767294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0767468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0767848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0768038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0768058Z 2022-11-23T02:07:37.0768166Z Running tests... 2022-11-23T02:07:37.0768428Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0768741Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0768991Z test_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0769200Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36819 2022-11-23T02:07:37.0769418Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36820 2022-11-23T02:07:37.0769791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0769968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0770397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0770597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0770966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0771135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0771502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0771696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0771943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0772188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0772597Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0773001Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0773233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0773571Z STAGE:2022-11-23 02:04:58 36819:36819 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0773803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0774129Z STAGE:2022-11-23 02:04:58 36820:36820 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0774412Z [1669169098.560190] [d8f8c46cdf70:36820:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0774652Z [1669169099.625730] [d8f8c46cdf70:36820:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0774897Z [1669169099.625730] [d8f8c46cdf70:36820:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0775173Z [1669169098.560164] [d8f8c46cdf70:36819:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0775404Z [1669169099.602322] [d8f8c46cdf70:36819:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0775700Z [1669169099.602322] [d8f8c46cdf70:36819:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0776260Z STAGE:2022-11-23 02:04:59 36820:36820 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:04:59 36819:36819 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0776280Z 2022-11-23T02:07:37.0776858Z STAGE:2022-11-23 02:04:59 36820:36820 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 02:04:59 36819:36819 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0776879Z 2022-11-23T02:07:37.0777212Z STAGE:2022-11-23 02:05:00 36820:36820 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0777538Z STAGE:2022-11-23 02:05:00 36819:36819 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0777858Z STAGE:2022-11-23 02:05:00 36820:36820 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0778212Z STAGE:2022-11-23 02:05:00 36820:36820 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0778550Z STAGE:2022-11-23 02:05:00 36819:36819 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0778948Z STAGE:2022-11-23 02:05:00 36819:36819 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0779058Z ok (5.871s) 2022-11-23T02:07:37.0779078Z 2022-11-23T02:07:37.0779346Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0779457Z Ran 1 test in 5.871s 2022-11-23T02:07:37.0779476Z 2022-11-23T02:07:37.0779566Z OK 2022-11-23T02:07:37.0779585Z 2022-11-23T02:07:37.0779708Z Generating XML reports... 
2022-11-23T02:07:37.0780142Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020454.xml 2022-11-23T02:07:37.0780521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0780696Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0781081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0781312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0781332Z 2022-11-23T02:07:37.0781440Z Running tests... 2022-11-23T02:07:37.0781707Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0782020Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0782281Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T02:07:37.0782301Z 2022-11-23T02:07:37.0782546Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0782663Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0782682Z 2022-11-23T02:07:37.0782790Z OK (skipped=1) 2022-11-23T02:07:37.0782809Z 2022-11-23T02:07:37.0782932Z Generating XML reports... 
2022-11-23T02:07:37.0783378Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020503.xml 2022-11-23T02:07:37.0783756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0783930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0784314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0784492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0784526Z 2022-11-23T02:07:37.0784618Z Running tests... 2022-11-23T02:07:37.0784882Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0785259Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0785528Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T02:07:37.0785548Z 2022-11-23T02:07:37.0785809Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0785924Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0785943Z 2022-11-23T02:07:37.0786051Z OK (skipped=1) 2022-11-23T02:07:37.0786071Z 2022-11-23T02:07:37.0786192Z Generating XML reports... 
2022-11-23T02:07:37.0786623Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020505.xml 2022-11-23T02:07:37.0786995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0787169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0787555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0787747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0787766Z 2022-11-23T02:07:37.0787875Z Running tests... 2022-11-23T02:07:37.0788135Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0788491Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0788761Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0788965Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36999 2022-11-23T02:07:37.0789182Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37000 2022-11-23T02:07:37.0789559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0789740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0790121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0790313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0790682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0790856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0791215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0791408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0791651Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0791900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0792310Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0792711Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0792948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0793285Z STAGE:2022-11-23 02:05:11 37000:37000 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0793513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0793834Z STAGE:2022-11-23 02:05:11 36999:36999 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0794115Z [1669169111.879618] [d8f8c46cdf70:36999:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0794413Z [1669169112.934818] [d8f8c46cdf70:36999:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0794660Z [1669169112.934818] [d8f8c46cdf70:36999:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0794937Z [1669169111.900963] [d8f8c46cdf70:37000:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0795346Z [1669169112.930388] [d8f8c46cdf70:37000:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0795598Z [1669169112.930388] [d8f8c46cdf70:37000:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0796165Z STAGE:2022-11-23 02:05:13 36999:36999 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:05:13 37000:37000 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0796190Z 2022-11-23T02:07:37.0796544Z STAGE:2022-11-23 02:05:13 37000:37000 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0796890Z STAGE:2022-11-23 02:05:13 36999:36999 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0797301Z STAGE:2022-11-23 02:05:13 37000:37000 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0797625Z STAGE:2022-11-23 02:05:13 36999:36999 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0797961Z STAGE:2022-11-23 02:05:13 37000:37000 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0798289Z STAGE:2022-11-23 02:05:13 36999:36999 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0798632Z STAGE:2022-11-23 02:05:13 37000:37000 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0798985Z STAGE:2022-11-23 02:05:13 36999:36999 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0799086Z ok (5.981s) 2022-11-23T02:07:37.0799106Z 2022-11-23T02:07:37.0799372Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0799484Z Ran 1 test in 5.981s 2022-11-23T02:07:37.0799504Z 2022-11-23T02:07:37.0799582Z OK 2022-11-23T02:07:37.0799617Z 2022-11-23T02:07:37.0799728Z Generating XML reports... 
2022-11-23T02:07:37.0800182Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020507.xml 2022-11-23T02:07:37.0800556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0800731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0801112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0801307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0801326Z 2022-11-23T02:07:37.0801435Z Running tests... 2022-11-23T02:07:37.0801698Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0801994Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0802257Z test_scatter (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0802279Z 2022-11-23T02:07:37.0802542Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0802655Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0802675Z 2022-11-23T02:07:37.0802782Z OK (skipped=1) 2022-11-23T02:07:37.0802801Z 2022-11-23T02:07:37.0802920Z Generating XML reports... 
2022-11-23T02:07:37.0803369Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020516.xml 2022-11-23T02:07:37.0803822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0803999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0804368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0804567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0804586Z 2022-11-23T02:07:37.0804692Z Running tests... 2022-11-23T02:07:37.0804953Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0805264Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0805532Z test_scatter_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0805552Z 2022-11-23T02:07:37.0805813Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0805927Z Ran 1 test in 0.003s 2022-11-23T02:07:37.0805946Z 2022-11-23T02:07:37.0806052Z OK (skipped=1) 2022-11-23T02:07:37.0806072Z 2022-11-23T02:07:37.0806179Z Generating XML reports... 
2022-11-23T02:07:37.0806631Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020518.xml 2022-11-23T02:07:37.0807051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0807233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0807615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0807808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0807828Z 2022-11-23T02:07:37.0807936Z Running tests... 2022-11-23T02:07:37.0808198Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0808498Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0808768Z test_scatter_complex (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0808788Z 2022-11-23T02:07:37.0809052Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0809161Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0809181Z 2022-11-23T02:07:37.0809289Z OK (skipped=1) 2022-11-23T02:07:37.0809307Z 2022-11-23T02:07:37.0809430Z Generating XML reports... 
2022-11-23T02:07:37.0809875Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020521.xml 2022-11-23T02:07:37.0810246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0810420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0810791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0810982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0811000Z 2022-11-23T02:07:37.0811107Z Running tests... 2022-11-23T02:07:37.0811373Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0811687Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0811944Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T02:07:37.0811964Z 2022-11-23T02:07:37.0812225Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0812337Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0812356Z 2022-11-23T02:07:37.0812463Z OK (skipped=1) 2022-11-23T02:07:37.0812482Z 2022-11-23T02:07:37.0812644Z Generating XML reports... 
2022-11-23T02:07:37.0813091Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020523.xml 2022-11-23T02:07:37.0813461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0813635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0814019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0814211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0814230Z 2022-11-23T02:07:37.0814341Z Running tests... 2022-11-23T02:07:37.0814605Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0814900Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0815166Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T02:07:37.0815189Z 2022-11-23T02:07:37.0815448Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0815558Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0815577Z 2022-11-23T02:07:37.0815685Z OK (skipped=1) 2022-11-23T02:07:37.0815704Z 2022-11-23T02:07:37.0815825Z Generating XML reports... 
2022-11-23T02:07:37.0816331Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020526.xml 2022-11-23T02:07:37.0816713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0816888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0817252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0817443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0817469Z 2022-11-23T02:07:37.0817577Z Running tests... 2022-11-23T02:07:37.0817838Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0818146Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0818421Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0818441Z 2022-11-23T02:07:37.0818705Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0818813Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0818832Z 2022-11-23T02:07:37.0818938Z OK (skipped=1) 2022-11-23T02:07:37.0818957Z 2022-11-23T02:07:37.0819063Z Generating XML reports... 
2022-11-23T02:07:37.0819510Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020528.xml 2022-11-23T02:07:37.0819884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0820063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0820442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0820636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0820656Z 2022-11-23T02:07:37.0820765Z Running tests... 2022-11-23T02:07:37.0821024Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0821315Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0821578Z test_scatter_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:07:37.0821597Z 2022-11-23T02:07:37.0821860Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0822033Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0822054Z 2022-11-23T02:07:37.0822162Z OK (skipped=1) 2022-11-23T02:07:37.0822181Z 2022-11-23T02:07:37.0822301Z Generating XML reports... 
2022-11-23T02:07:37.0822749Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020530.xml 2022-11-23T02:07:37.0823125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0823304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0823668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0823861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0823880Z 2022-11-23T02:07:37.0823988Z Running tests... 2022-11-23T02:07:37.0824249Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0824565Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0824949Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:07:37.0824969Z 2022-11-23T02:07:37.0825231Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0825392Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0825412Z 2022-11-23T02:07:37.0825528Z OK (skipped=1) 2022-11-23T02:07:37.0825547Z 2022-11-23T02:07:37.0825652Z Generating XML reports... 
2022-11-23T02:07:37.0826099Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020533.xml 2022-11-23T02:07:37.0826472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0826648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0827033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0827223Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0827242Z 2022-11-23T02:07:37.0827350Z Running tests... 2022-11-23T02:07:37.0827613Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0827923Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0828154Z test_send_recv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0828375Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37377 2022-11-23T02:07:37.0828590Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37378 2022-11-23T02:07:37.0828963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0829145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0829529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0829723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0830081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0830260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0830643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0830832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0831079Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0831324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0831774Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0832175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0832414Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0832645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0832924Z [1669169139.563451] [d8f8c46cdf70:37378:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0833164Z [1669169140.353029] [d8f8c46cdf70:37378:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0833410Z [1669169140.353029] [d8f8c46cdf70:37378:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0833695Z [1669169139.563477] [d8f8c46cdf70:37377:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0833928Z [1669169140.363372] [d8f8c46cdf70:37377:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0834217Z [1669169140.363372] [d8f8c46cdf70:37377:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0834313Z ok (5.476s) 2022-11-23T02:07:37.0834347Z 2022-11-23T02:07:37.0834602Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0834712Z Ran 1 test in 5.476s 2022-11-23T02:07:37.0834732Z 2022-11-23T02:07:37.0834823Z OK 2022-11-23T02:07:37.0834842Z 2022-11-23T02:07:37.0834965Z Generating XML reports... 
2022-11-23T02:07:37.0835632Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020535.xml 2022-11-23T02:07:37.0836017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0836196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0836582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0836760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0836781Z 2022-11-23T02:07:37.0836891Z Running tests... 2022-11-23T02:07:37.0837156Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0837466Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0837751Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T02:07:37.0837775Z 2022-11-23T02:07:37.0838038Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0838152Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0838172Z 2022-11-23T02:07:37.0838282Z OK (skipped=1) 2022-11-23T02:07:37.0838300Z 2022-11-23T02:07:37.0838424Z Generating XML reports... 
2022-11-23T02:07:37.0838860Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020543.xml 2022-11-23T02:07:37.0839238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0839418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0839801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0839995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0840095Z 2022-11-23T02:07:37.0840212Z Running tests... 2022-11-23T02:07:37.0840482Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0840797Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0841096Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T02:07:37.0841131Z 2022-11-23T02:07:37.0841377Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0841490Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0841510Z 2022-11-23T02:07:37.0841617Z OK (skipped=1) 2022-11-23T02:07:37.0841636Z 2022-11-23T02:07:37.0841756Z Generating XML reports... 
2022-11-23T02:07:37.0842197Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020546.xml 2022-11-23T02:07:37.0842573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0842754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0843134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0843311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0843407Z 2022-11-23T02:07:37.0843509Z Running tests... 2022-11-23T02:07:37.0843776Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0844089Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0844389Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T02:07:37.0844409Z 2022-11-23T02:07:37.0844675Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0844790Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0844810Z 2022-11-23T02:07:37.0844919Z OK (skipped=1) 2022-11-23T02:07:37.0844938Z 2022-11-23T02:07:37.0845061Z Generating XML reports... 
2022-11-23T02:07:37.0845494Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020548.xml 2022-11-23T02:07:37.0845877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0846058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0846439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0846631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0846650Z 2022-11-23T02:07:37.0846760Z Running tests... 2022-11-23T02:07:37.0847025Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0847341Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0847620Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0847827Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37586 2022-11-23T02:07:37.0848048Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37587 2022-11-23T02:07:37.0848426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0848604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0848988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0849180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0849546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0849777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0850147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0850339Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0850592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0850995Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0851243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0851642Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0851872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0852102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0852447Z STAGE:2022-11-23 02:05:54 37586:37586 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0852804Z STAGE:2022-11-23 02:05:54 37587:37587 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0853098Z [1669169154.812326] [d8f8c46cdf70:37587:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0853339Z [1669169155.854623] [d8f8c46cdf70:37587:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0853587Z [1669169155.854623] [d8f8c46cdf70:37587:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0853935Z STAGE:2022-11-23 02:05:56 37587:37587 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0854291Z STAGE:2022-11-23 02:05:56 37587:37587 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0854569Z [1669169154.812366] [d8f8c46cdf70:37586:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0854805Z [1669169155.863602] [d8f8c46cdf70:37586:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0855046Z [1669169155.863602] [d8f8c46cdf70:37586:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0855387Z STAGE:2022-11-23 02:05:56 37586:37586 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0855722Z STAGE:2022-11-23 02:05:56 37586:37586 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0855828Z ok (5.886s) 2022-11-23T02:07:37.0855853Z 2022-11-23T02:07:37.0856120Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0856235Z Ran 1 test in 5.886s 2022-11-23T02:07:37.0856254Z 2022-11-23T02:07:37.0856346Z OK 2022-11-23T02:07:37.0856365Z 2022-11-23T02:07:37.0856488Z Generating XML reports... 2022-11-23T02:07:37.0856939Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020550.xml 2022-11-23T02:07:37.0857318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0857478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0857860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0858054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0858073Z 2022-11-23T02:07:37.0858239Z Running tests... 2022-11-23T02:07:37.0858508Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0858820Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0859057Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) ... 
skip: NCCL Send Recv Only (0.002s) 2022-11-23T02:07:37.0859076Z 2022-11-23T02:07:37.0859340Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0859455Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0859474Z 2022-11-23T02:07:37.0859565Z OK (skipped=1) 2022-11-23T02:07:37.0859584Z 2022-11-23T02:07:37.0859709Z Generating XML reports... 2022-11-23T02:07:37.0860162Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020559.xml 2022-11-23T02:07:37.0860537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0860721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0861100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0861293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0861312Z 2022-11-23T02:07:37.0861421Z Running tests... 2022-11-23T02:07:37.0861733Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0862039Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0862305Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T02:07:37.0862326Z 2022-11-23T02:07:37.0862587Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0862701Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0862720Z 2022-11-23T02:07:37.0862829Z OK (skipped=1) 2022-11-23T02:07:37.0862852Z 2022-11-23T02:07:37.0862980Z Generating XML reports... 
2022-11-23T02:07:37.0863427Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020601.xml 2022-11-23T02:07:37.0863803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0863986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0864351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0864542Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0864562Z 2022-11-23T02:07:37.0864672Z Running tests... 2022-11-23T02:07:37.0864935Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0865244Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0865514Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T02:07:37.0865533Z 2022-11-23T02:07:37.0865796Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0865906Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0865925Z 2022-11-23T02:07:37.0866016Z OK (skipped=1) 2022-11-23T02:07:37.0866046Z 2022-11-23T02:07:37.0866158Z Generating XML reports... 
2022-11-23T02:07:37.0866603Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020604.xml 2022-11-23T02:07:37.0866976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0867151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0867535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0867783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0867803Z 2022-11-23T02:07:37.0867911Z Running tests... 2022-11-23T02:07:37.0868180Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0868475Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0868749Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0868969Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37799 2022-11-23T02:07:37.0869181Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37800 2022-11-23T02:07:37.0869558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0869735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0870119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0870317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0870669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0870892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0871282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0871471Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0871718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0871957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0872360Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0872769Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0873003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0873331Z STAGE:2022-11-23 02:06:10 37799:37799 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0873561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0873899Z STAGE:2022-11-23 02:06:10 37800:37800 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0874178Z [1669169170.620527] [d8f8c46cdf70:37800:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0874416Z [1669169171.696589] [d8f8c46cdf70:37800:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0874663Z [1669169171.696589] [d8f8c46cdf70:37800:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0875009Z STAGE:2022-11-23 02:06:12 37800:37800 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0875571Z STAGE:2022-11-23 02:06:12 37800:37800 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0875850Z [1669169170.618948] [d8f8c46cdf70:37799:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0876085Z [1669169171.678842] [d8f8c46cdf70:37799:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0876308Z [1669169171.678842] [d8f8c46cdf70:37799:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0876734Z STAGE:2022-11-23 02:06:12 37799:37799 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0877088Z STAGE:2022-11-23 02:06:12 37799:37799 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0877189Z ok (5.978s) 2022-11-23T02:07:37.0877209Z 2022-11-23T02:07:37.0877473Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0877590Z Ran 1 test in 5.978s 2022-11-23T02:07:37.0877611Z 2022-11-23T02:07:37.0877705Z OK 2022-11-23T02:07:37.0877725Z 2022-11-23T02:07:37.0877846Z Generating XML reports... 2022-11-23T02:07:37.0878294Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020606.xml 2022-11-23T02:07:37.0878658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0878837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0879228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0879424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0879443Z 2022-11-23T02:07:37.0879552Z Running tests... 2022-11-23T02:07:37.0879813Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0880199Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0880470Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0880672Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37913 2022-11-23T02:07:37.0880891Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37914 2022-11-23T02:07:37.0881303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0881484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0881869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0882061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0882432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0882608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0882992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0883167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0883416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0883663Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0884072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0884472Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0884710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0884944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0885228Z [1669169179.118951] [d8f8c46cdf70:37913:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0885464Z [1669169179.915210] [d8f8c46cdf70:37913:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0885694Z [1669169179.915210] [d8f8c46cdf70:37913:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0886029Z [1669169179.139946] [d8f8c46cdf70:37914:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0886265Z [1669169179.932877] [d8f8c46cdf70:37914:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0886507Z [1669169179.932877] [d8f8c46cdf70:37914:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0886610Z ok (5.450s) 2022-11-23T02:07:37.0886631Z 2022-11-23T02:07:37.0886905Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0887021Z Ran 1 test in 5.451s 2022-11-23T02:07:37.0887041Z 2022-11-23T02:07:37.0887131Z OK 2022-11-23T02:07:37.0887150Z 2022-11-23T02:07:37.0887273Z Generating XML reports... 
2022-11-23T02:07:37.0887703Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020615.xml 2022-11-23T02:07:37.0888089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0888266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0888700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0888902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0888921Z 2022-11-23T02:07:37.0889029Z Running tests... 2022-11-23T02:07:37.0889295Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0889607Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0889895Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0890106Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38023 2022-11-23T02:07:37.0890328Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38024 2022-11-23T02:07:37.0890701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0890882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0891265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0891457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0891821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0891997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0892362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0892558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0892808Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0893056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0893459Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0893859Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0894091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0894430Z STAGE:2022-11-23 02:06:27 38024:38024 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0894716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0895041Z STAGE:2022-11-23 02:06:27 38023:38023 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0895320Z [1669169187.143026] [d8f8c46cdf70:38023:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0895561Z [1669169188.182241] [d8f8c46cdf70:38023:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0895804Z [1669169188.182241] [d8f8c46cdf70:38023:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0896148Z STAGE:2022-11-23 02:06:28 38023:38023 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0896424Z [1669169187.145467] [d8f8c46cdf70:38024:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0896665Z [1669169188.190777] [d8f8c46cdf70:38024:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0896907Z [1669169188.190777] [d8f8c46cdf70:38024:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0897343Z STAGE:2022-11-23 02:06:28 38024:38024 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0897704Z STAGE:2022-11-23 02:06:28 38023:38023 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0898039Z STAGE:2022-11-23 02:06:28 38024:38024 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0898140Z ok (5.961s) 2022-11-23T02:07:37.0898160Z 2022-11-23T02:07:37.0898423Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0898536Z Ran 1 test in 5.961s 2022-11-23T02:07:37.0898555Z 2022-11-23T02:07:37.0898652Z OK 2022-11-23T02:07:37.0898671Z 2022-11-23T02:07:37.0898792Z Generating XML reports... 2022-11-23T02:07:37.0899243Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020623.xml 2022-11-23T02:07:37.0899618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0899783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0900169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0900364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0900383Z 2022-11-23T02:07:37.0900490Z Running tests... 
2022-11-23T02:07:37.0900750Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0901064Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0901351Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0901573Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38137 2022-11-23T02:07:37.0901793Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38138 2022-11-23T02:07:37.0902157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0902337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0902707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0902887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0903264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0903516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0903900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0904090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0904323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0904573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0904977Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0905371Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0905600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0905833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0906175Z STAGE:2022-11-23 02:06:35 38138:38138 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0906504Z STAGE:2022-11-23 02:06:35 38137:38137 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:07:37.0906828Z [1669169195.485749] [d8f8c46cdf70:38137:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0907071Z [1669169196.504178] [d8f8c46cdf70:38137:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0907298Z [1669169196.504178] [d8f8c46cdf70:38137:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0907642Z STAGE:2022-11-23 02:06:36 38137:38137 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0907926Z [1669169195.506357] [d8f8c46cdf70:38138:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0908161Z [1669169196.515080] [d8f8c46cdf70:38138:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0908410Z [1669169196.515080] [d8f8c46cdf70:38138:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0908752Z STAGE:2022-11-23 02:06:36 38138:38138 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:07:37.0909105Z STAGE:2022-11-23 02:06:36 38137:38137 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0909448Z STAGE:2022-11-23 02:06:36 38138:38138 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:07:37.0909551Z ok (5.804s) 2022-11-23T02:07:37.0909572Z 2022-11-23T02:07:37.0909821Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0909938Z Ran 1 test in 5.804s 2022-11-23T02:07:37.0909957Z 2022-11-23T02:07:37.0910054Z OK 2022-11-23T02:07:37.0910073Z 2022-11-23T02:07:37.0910194Z Generating XML reports... 2022-11-23T02:07:37.0910648Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020631.xml 2022-11-23T02:07:37.0911030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0911208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0911594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0911792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0911811Z 2022-11-23T02:07:37.0911903Z Running tests... 
2022-11-23T02:07:37.0912165Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0912539Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0912823Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T02:07:37.0912844Z 2022-11-23T02:07:37.0913103Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0913217Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0913236Z 2022-11-23T02:07:37.0913343Z OK (skipped=1) 2022-11-23T02:07:37.0913362Z 2022-11-23T02:07:37.0913488Z Generating XML reports... 2022-11-23T02:07:37.0913919Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020640.xml 2022-11-23T02:07:37.0914299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0914476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0914865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0915291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0915313Z 2022-11-23T02:07:37.0915429Z Running tests... 2022-11-23T02:07:37.0915771Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0916099Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0916396Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T02:07:37.0916417Z 2022-11-23T02:07:37.0916678Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0916774Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0916794Z 2022-11-23T02:07:37.0916902Z OK (skipped=1) 2022-11-23T02:07:37.0916921Z 2022-11-23T02:07:37.0917049Z Generating XML reports... 2022-11-23T02:07:37.0917496Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020642.xml 2022-11-23T02:07:37.0917874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0918055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0918443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0918638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0918657Z 2022-11-23T02:07:37.0918749Z Running tests... 2022-11-23T02:07:37.0919012Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0919324Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0919589Z test_stateless_api_with_ddp (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0919816Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38317 2022-11-23T02:07:37.0920033Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38318 2022-11-23T02:07:37.0920412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0920590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0920957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0921151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0921520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0921694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0922147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0922336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0922584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0922832Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0923238Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0923623Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0923858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0924087Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0924376Z [1669169209.522019] [d8f8c46cdf70:38318:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0924612Z [1669169209.528864] [d8f8c46cdf70:38318:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0924905Z [1669169209.528864] [d8f8c46cdf70:38318:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0925192Z [1669169209.521338] [d8f8c46cdf70:38317:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0925424Z [1669169209.526872] [d8f8c46cdf70:38317:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0925665Z [1669169209.526872] [d8f8c46cdf70:38317:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0925775Z ok (6.034s) 2022-11-23T02:07:37.0925796Z 2022-11-23T02:07:37.0926050Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0926164Z Ran 1 test in 6.034s 2022-11-23T02:07:37.0926183Z 2022-11-23T02:07:37.0926275Z OK 2022-11-23T02:07:37.0926294Z 2022-11-23T02:07:37.0926414Z Generating XML reports... 
2022-11-23T02:07:37.0926864Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020644.xml 2022-11-23T02:07:37.0927238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0927414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0927796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0927973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0928013Z 2022-11-23T02:07:37.0928108Z Running tests... 2022-11-23T02:07:37.0928377Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0928684Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0928951Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0929170Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38435 2022-11-23T02:07:37.0929387Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38436 2022-11-23T02:07:37.0929760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0929940Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0930308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0930563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0930936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0931110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0931489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0931678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0931930Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0932177Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0932580Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0932969Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0933198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0933452Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0bmotvl4 2022-11-23T02:07:37.0933768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0bmotvl4/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0934006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0934258Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4_5but41 2022-11-23T02:07:37.0934523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4_5but41/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0934802Z [1669169217.409567] [d8f8c46cdf70:38436:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0935046Z [1669169218.189478] [d8f8c46cdf70:38436:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0935277Z [1669169218.189478] [d8f8c46cdf70:38436:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0935558Z [1669169217.387987] [d8f8c46cdf70:38435:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0935793Z [1669169218.172508] [d8f8c46cdf70:38435:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0936034Z [1669169218.172508] [d8f8c46cdf70:38435:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0936135Z ok (5.456s) 2022-11-23T02:07:37.0936155Z 2022-11-23T02:07:37.0936423Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0936542Z Ran 1 test in 5.456s 2022-11-23T02:07:37.0936561Z 2022-11-23T02:07:37.0936654Z OK 2022-11-23T02:07:37.0936674Z 2022-11-23T02:07:37.0936795Z Generating XML reports... 2022-11-23T02:07:37.0937232Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020653.xml 2022-11-23T02:07:37.0937607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0937785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0938171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0938366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0938385Z 2022-11-23T02:07:37.0938495Z Running tests... 2022-11-23T02:07:37.0938758Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0939132Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0939423Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl & Gloo backend support DistributedDataParallel (0.002s) 2022-11-23T02:07:37.0939458Z 2022-11-23T02:07:37.0939710Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0939822Z Ran 1 test in 0.002s 2022-11-23T02:07:37.0939841Z 2022-11-23T02:07:37.0939949Z OK (skipped=1) 2022-11-23T02:07:37.0939968Z 2022-11-23T02:07:37.0940088Z Generating XML reports... 2022-11-23T02:07:37.0940534Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020701.xml 2022-11-23T02:07:37.0940907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0941085Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0941471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0941649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0941685Z 2022-11-23T02:07:37.0941777Z Running tests... 2022-11-23T02:07:37.0942100Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0942418Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0942710Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0942931Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38582 2022-11-23T02:07:37.0943147Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38583 2022-11-23T02:07:37.0943519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0943700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0944068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0944260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0944630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0944805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0945178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0945375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0945623Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0945874Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0946280Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0946669Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0946904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0947132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0947391Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyejl69ky 2022-11-23T02:07:37.0947664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyejl69ky/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0947919Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5h3shslv 2022-11-23T02:07:37.0948246Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5h3shslv/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0948527Z [1669169228.573492] [d8f8c46cdf70:38583:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0948772Z [1669169228.580159] [d8f8c46cdf70:38583:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0949001Z [1669169228.580159] [d8f8c46cdf70:38583:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0949777Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:07:37.0950058Z [1669169228.573096] [d8f8c46cdf70:38582:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0950337Z [1669169228.580186] [d8f8c46cdf70:38582:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0950582Z [1669169228.580186] [d8f8c46cdf70:38582:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0951364Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:07:37.0951473Z ok (5.975s) 2022-11-23T02:07:37.0951493Z 2022-11-23T02:07:37.0951762Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0951878Z Ran 1 test in 5.975s 2022-11-23T02:07:37.0951897Z 2022-11-23T02:07:37.0951994Z OK 2022-11-23T02:07:37.0952013Z 2022-11-23T02:07:37.0952138Z Generating XML reports... 2022-11-23T02:07:37.0952589Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020703.xml 2022-11-23T02:07:37.0952966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0953129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0953519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0953714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0953733Z 2022-11-23T02:07:37.0953842Z Running tests... 
2022-11-23T02:07:37.0954108Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0954427Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0954716Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0954936Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38700 2022-11-23T02:07:37.0955343Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38701 2022-11-23T02:07:37.0955728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0955987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0956376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0956571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0956944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0957119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0957498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0957688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0957921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0958164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0958573Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0958974Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0959260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0959501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0959741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0959980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0960382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0960769Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0961014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:07:37.0961253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:07:37.0961650Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:37.0962050Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:07:37.0962309Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzybvmn4f 2022-11-23T02:07:37.0962583Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzybvmn4f/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0962841Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpscidjmlh 2022-11-23T02:07:37.0963115Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpscidjmlh/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0963381Z [1669169237.135580] [d8f8c46cdf70:38701:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0963624Z [1669169237.141538] [d8f8c46cdf70:38701:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0963868Z [1669169237.141538] [d8f8c46cdf70:38701:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0964267Z [1669169242.540027] [d8f8c46cdf70:38701:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x5583d26097c0 was not matched 2022-11-23T02:07:37.0964540Z [1669169237.126302] [d8f8c46cdf70:38700:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0964827Z [1669169237.132243] [d8f8c46cdf70:38700:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0965064Z [1669169237.132243] [d8f8c46cdf70:38700:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0965393Z [1669169242.502946] [d8f8c46cdf70:38700:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. 
has expired on req 0x5603e62fa3c0, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T02:07:37.0965678Z [1669169242.550003] [d8f8c46cdf70:38700:0] mpool.c:55 UCX WARN object 0x5603e630b800 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T02:07:37.0965782Z ok (10.637s) 2022-11-23T02:07:37.0965803Z 2022-11-23T02:07:37.0966075Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0966178Z Ran 1 test in 10.637s 2022-11-23T02:07:37.0966197Z 2022-11-23T02:07:37.0966292Z OK 2022-11-23T02:07:37.0966312Z 2022-11-23T02:07:37.0966436Z Generating XML reports... 2022-11-23T02:07:37.0966887Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020712.xml 2022-11-23T02:07:37.0967313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0967499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0967882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0968074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0968094Z 2022-11-23T02:07:37.0968187Z Running tests... 2022-11-23T02:07:37.0968451Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0968765Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:07:37.0969054Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:07:37.0969276Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38820 2022-11-23T02:07:37.0969499Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38821 2022-11-23T02:07:37.0969874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0970049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0970430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0970605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0970974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:07:37.0971154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:07:37.0971532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:07:37.0971718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:07:37.0971969Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:07:37.0972261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:07:37.0972665Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:07:37.0973049Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:07:37.0973340Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:07:37.0973567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:07:37.0973806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:07:37.0974051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:07:37.0974454Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0974850Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:07:37.0975096Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:07:37.0975327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:07:37.0975710Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:07:37.0976098Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:07:37.0976357Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_c4btp1m 2022-11-23T02:07:37.0976679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_c4btp1m/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0976945Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppm2u77ue 2022-11-23T02:07:37.0977215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppm2u77ue/_remote_module_non_scriptable.py 2022-11-23T02:07:37.0977495Z [1669169250.284297] [d8f8c46cdf70:38820:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:07:37.0977736Z [1669169250.290085] [d8f8c46cdf70:38820:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0977983Z [1669169250.290085] [d8f8c46cdf70:38820:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0978311Z [1669169255.654279] [d8f8c46cdf70:38820:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. has expired on req 0x56326f6ad1c0, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T02:07:37.0978599Z [1669169255.691340] [d8f8c46cdf70:38820:0] mpool.c:55 UCX WARN object 0x56326f7be6c0 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T02:07:37.0978862Z [1669169250.286447] [d8f8c46cdf70:38821:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:07:37.0979096Z [1669169250.291708] [d8f8c46cdf70:38821:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:07:37.0979342Z [1669169250.291708] [d8f8c46cdf70:38821:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:07:37.0979742Z [1669169255.701349] [d8f8c46cdf70:38821:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x55a894037540 was not matched 2022-11-23T02:07:37.0979851Z ok (10.611s) 2022-11-23T02:07:37.0979872Z 2022-11-23T02:07:37.0980141Z ---------------------------------------------------------------------- 2022-11-23T02:07:37.0980255Z Ran 1 test in 10.611s 2022-11-23T02:07:37.0980274Z 2022-11-23T02:07:37.0980367Z OK 2022-11-23T02:07:37.0980386Z 2022-11-23T02:07:37.0980510Z Generating XML reports... 2022-11-23T02:07:37.0980944Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020725.xml 2022-11-23T02:07:37.0980982Z 2022-11-23T02:07:37.0981443Z ##[endgroup] 2022-11-23T02:07:37.0981975Z FINISHED PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_mqf8_nzl) 2022-11-23T02:07:37.0981996Z 2022-11-23T02:07:37.0982276Z Running distributed/pipeline/sync/test_worker ... [2022-11-23 02:07:36.866817] 2022-11-23T02:07:37.0982679Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_worker.py', '-v'] ... 
[2022-11-23 02:07:36.867098] 2022-11-23T02:07:39.8447477Z 2022-11-23T02:07:39.8448296Z Expand the folded group to see the log file of distributed/pipeline/sync/test_worker 2022-11-23T02:07:39.8450164Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_worker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_worker_3cyy1sro) 2022-11-23T02:07:39.8450714Z ============================= test session starts ============================== 2022-11-23T02:07:39.8451348Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:07:39.8451732Z cachedir: .pytest_cache 2022-11-23T02:07:39.8452290Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:07:39.8452735Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:07:39.8453072Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:07:39.8453852Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:07:39.8454281Z collecting ... 
collected 6 items 2022-11-23T02:07:39.8455154Z Running 6 items in this shard: test/distributed/pipeline/sync/test_worker.py::test_compute_multithreading, test/distributed/pipeline/sync/test_worker.py::test_compute_success, test/distributed/pipeline/sync/test_worker.py::test_compute_exception, test/distributed/pipeline/sync/test_worker.py::test_grad_mode[True], test/distributed/pipeline/sync/test_worker.py::test_grad_mode[False], test/distributed/pipeline/sync/test_worker.py::test_worker_per_device 2022-11-23T02:07:39.8455865Z 2022-11-23T02:07:39.8456107Z distributed/pipeline/sync/test_worker.py::test_compute_multithreading PASSED [ 16%] 2022-11-23T02:07:39.8456563Z distributed/pipeline/sync/test_worker.py::test_compute_success PASSED [ 33%] 2022-11-23T02:07:39.8456993Z distributed/pipeline/sync/test_worker.py::test_compute_exception PASSED [ 50%] 2022-11-23T02:07:39.8457433Z distributed/pipeline/sync/test_worker.py::test_grad_mode[True] PASSED [ 66%] 2022-11-23T02:07:39.8457871Z distributed/pipeline/sync/test_worker.py::test_grad_mode[False] PASSED [ 83%] 2022-11-23T02:07:39.8458312Z distributed/pipeline/sync/test_worker.py::test_worker_per_device PASSED [100%] 2022-11-23T02:07:39.8458547Z 2022-11-23T02:07:39.8458711Z ============================== 6 passed in 0.07s =============================== 2022-11-23T02:07:39.8458906Z 2022-11-23T02:07:39.8459222Z ##[endgroup] 2022-11-23T02:07:39.8459868Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_worker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_worker_3cyy1sro) 2022-11-23T02:07:39.8460234Z 2022-11-23T02:07:39.8460538Z Running distributed/pipeline/sync/test_pipeline ... [2022-11-23 02:07:39.844854] 2022-11-23T02:07:39.8461163Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_pipeline.py', '-v'] ... 
[2022-11-23 02:07:39.845170] 2022-11-23T02:07:42.2462455Z 2022-11-23T02:07:42.2463084Z Expand the folded group to see the log file of distributed/pipeline/sync/test_pipeline 2022-11-23T02:07:42.2464101Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_pipeline (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_pipeline_vd8zxv21) 2022-11-23T02:07:42.2464896Z ============================= test session starts ============================== 2022-11-23T02:07:42.2465525Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:07:42.2465873Z cachedir: .pytest_cache 2022-11-23T02:07:42.2467012Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:07:42.2467472Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:07:42.2467790Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:07:42.2468623Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:07:42.2469036Z collecting ... collected 1 item 2022-11-23T02:07:42.2469449Z Running 1 items in this shard: test/distributed/pipeline/sync/test_pipeline.py::test_clock_cycles 2022-11-23T02:07:42.2469706Z 2022-11-23T02:07:42.2469926Z distributed/pipeline/sync/test_pipeline.py::test_clock_cycles PASSED [100%] 2022-11-23T02:07:42.2470181Z 2022-11-23T02:07:42.2470343Z ============================== 1 passed in 0.03s =============================== 2022-11-23T02:07:42.2470543Z 2022-11-23T02:07:42.2470862Z ##[endgroup] 2022-11-23T02:07:42.2471498Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_pipeline (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_pipeline_vd8zxv21) 2022-11-23T02:07:42.2471898Z 2022-11-23T02:07:42.2472209Z Running distributed/pipeline/sync/test_microbatch ... 
[2022-11-23 02:07:42.246357] 2022-11-23T02:07:42.2472972Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_microbatch.py', '-v'] ... [2022-11-23 02:07:42.246719] 2022-11-23T02:07:44.7212181Z 2022-11-23T02:07:44.7212912Z Expand the folded group to see the log file of distributed/pipeline/sync/test_microbatch 2022-11-23T02:07:44.7213909Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_microbatch (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_microbatch_7oi_hc03) 2022-11-23T02:07:44.7214469Z ============================= test session starts ============================== 2022-11-23T02:07:44.7215070Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:07:44.7215477Z cachedir: .pytest_cache 2022-11-23T02:07:44.7216048Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:07:44.7216491Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:07:44.7216807Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:07:44.7217396Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:07:44.7217805Z collecting ... 
collected 10 items 2022-11-23T02:07:44.7219066Z Running 10 items in this shard: test/distributed/pipeline/sync/test_microbatch.py::test_batch_atomic, test/distributed/pipeline/sync/test_microbatch.py::test_batch_non_atomic, test/distributed/pipeline/sync/test_microbatch.py::test_batch_call, test/distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_index, test/distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_slice, test/distributed/pipeline/sync/test_microbatch.py::test_check, test/distributed/pipeline/sync/test_microbatch.py::test_gather_tensors, test/distributed/pipeline/sync/test_microbatch.py::test_gather_tuples, test/distributed/pipeline/sync/test_microbatch.py::test_scatter_tensor, test/distributed/pipeline/sync/test_microbatch.py::test_scatter_multiple_tensors 2022-11-23T02:07:44.7220155Z 2022-11-23T02:07:44.7220384Z distributed/pipeline/sync/test_microbatch.py::test_batch_atomic PASSED [ 10%] 2022-11-23T02:07:44.7220843Z distributed/pipeline/sync/test_microbatch.py::test_batch_non_atomic PASSED [ 20%] 2022-11-23T02:07:44.7221275Z distributed/pipeline/sync/test_microbatch.py::test_batch_call PASSED [ 30%] 2022-11-23T02:07:44.7221730Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_index PASSED [ 40%] 2022-11-23T02:07:44.7222199Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_slice PASSED [ 50%] 2022-11-23T02:07:44.7222634Z distributed/pipeline/sync/test_microbatch.py::test_check PASSED [ 60%] 2022-11-23T02:07:44.7223341Z distributed/pipeline/sync/test_microbatch.py::test_gather_tensors PASSED [ 70%] 2022-11-23T02:07:44.7223779Z distributed/pipeline/sync/test_microbatch.py::test_gather_tuples PASSED [ 80%] 2022-11-23T02:07:44.7224217Z distributed/pipeline/sync/test_microbatch.py::test_scatter_tensor PASSED [ 90%] 2022-11-23T02:07:44.7224667Z distributed/pipeline/sync/test_microbatch.py::test_scatter_multiple_tensors PASSED [100%] 2022-11-23T02:07:44.7224939Z 2022-11-23T02:07:44.7225101Z 
============================== 10 passed in 0.08s ============================== 2022-11-23T02:07:44.7225299Z 2022-11-23T02:07:44.7225612Z ##[endgroup] 2022-11-23T02:07:44.7226258Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_microbatch (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_microbatch_7oi_hc03) 2022-11-23T02:07:44.7226646Z 2022-11-23T02:07:44.7226969Z Running distributed/pipeline/sync/test_deferred_batch_norm ... [2022-11-23 02:07:44.721310] 2022-11-23T02:07:44.7227629Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_deferred_batch_norm.py', '-v'] ... [2022-11-23 02:07:44.721572] 2022-11-23T02:07:47.7240183Z 2022-11-23T02:07:47.7241418Z Expand the folded group to see the log file of distributed/pipeline/sync/test_deferred_batch_norm 2022-11-23T02:07:47.7243507Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_deferred_batch_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_deferred_batch_norm_cfrg4guk) 2022-11-23T02:07:47.7244555Z ============================= test session starts ============================== 2022-11-23T02:07:47.7245627Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:07:47.7246287Z cachedir: .pytest_cache 2022-11-23T02:07:47.7247392Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:07:47.7248204Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:07:47.7248796Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:07:47.7249448Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:07:47.7249838Z collecting ... 
collected 11 items 2022-11-23T02:07:47.7252511Z Running 11 items in this shard: test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-1], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-4], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-1], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-4], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[0.1], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[None], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_convert_deferred_batch_norm, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_eval, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_optimize, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_conv_bn, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_input_requiring_grad 2022-11-23T02:07:47.7253830Z 2022-11-23T02:07:47.7254176Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-1] PASSED [ 9%] 2022-11-23T02:07:47.7254773Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-4] PASSED [ 18%] 2022-11-23T02:07:47.7255364Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-1] PASSED [ 27%] 2022-11-23T02:07:47.7255954Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-4] PASSED [ 36%] 2022-11-23T02:07:47.7256423Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[0.1] PASSED [ 45%] 2022-11-23T02:07:47.7256901Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[None] PASSED [ 54%] 2022-11-23T02:07:47.7257583Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_convert_deferred_batch_norm PASSED [ 63%] 2022-11-23T02:07:47.7258061Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_eval 
PASSED [ 72%] 2022-11-23T02:07:47.7258502Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_optimize PASSED [ 81%] 2022-11-23T02:07:47.7258964Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_conv_bn PASSED [ 90%] 2022-11-23T02:07:47.7259436Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_input_requiring_grad PASSED [100%] 2022-11-23T02:07:47.7259709Z 2022-11-23T02:07:47.7259856Z ============================== 11 passed in 0.62s ============================== 2022-11-23T02:07:47.7260056Z 2022-11-23T02:07:47.7260406Z ##[endgroup] 2022-11-23T02:07:47.7261108Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_deferred_batch_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_deferred_batch_norm_cfrg4guk) 2022-11-23T02:07:47.7261532Z 2022-11-23T02:07:47.7261835Z Running distributed/pipeline/sync/test_bugs ... [2022-11-23 02:07:47.724268] 2022-11-23T02:07:47.7262432Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_bugs.py', '-v'] ... 
[2022-11-23 02:07:47.724624] 2022-11-23T02:07:54.7840432Z 2022-11-23T02:07:54.7841457Z Expand the folded group to see the log file of distributed/pipeline/sync/test_bugs 2022-11-23T02:07:54.7842440Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_bugs (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_bugs_fclysae_) 2022-11-23T02:07:54.7842976Z ============================= test session starts ============================== 2022-11-23T02:07:54.7843584Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:07:54.7843971Z cachedir: .pytest_cache 2022-11-23T02:07:54.7844533Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:07:54.7844996Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:07:54.7845334Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:07:54.7845898Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:07:54.7846318Z collecting ... 
collected 4 items 2022-11-23T02:07:54.7846994Z Running 4 items in this shard: test/distributed/pipeline/sync/test_bugs.py::test_python_autograd_function, test/distributed/pipeline/sync/test_bugs.py::test_exception_no_hang, test/distributed/pipeline/sync/test_bugs.py::test_tuple_wait, test/distributed/pipeline/sync/test_bugs.py::test_parallel_randoms 2022-11-23T02:07:54.7847524Z 2022-11-23T02:07:54.7847753Z distributed/pipeline/sync/test_bugs.py::test_python_autograd_function PASSED [ 25%] 2022-11-23T02:07:54.7848212Z distributed/pipeline/sync/test_bugs.py::test_exception_no_hang PASSED [ 50%] 2022-11-23T02:07:54.7848633Z distributed/pipeline/sync/test_bugs.py::test_tuple_wait PASSED [ 75%] 2022-11-23T02:07:54.7849063Z distributed/pipeline/sync/test_bugs.py::test_parallel_randoms PASSED [100%] 2022-11-23T02:07:54.7849318Z 2022-11-23T02:07:54.7849482Z ============================== 4 passed in 4.58s =============================== 2022-11-23T02:07:54.7849679Z 2022-11-23T02:07:54.7849971Z ##[endgroup] 2022-11-23T02:07:54.7850605Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_bugs (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_bugs_fclysae_) 2022-11-23T02:07:54.7850980Z 2022-11-23T02:07:54.7851277Z Running distributed/pipeline/sync/skip/test_tracker ... [2022-11-23 02:07:54.784028] 2022-11-23T02:07:54.7851924Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_tracker.py', '-v'] ... 
[2022-11-23 02:07:54.784292] 2022-11-23T02:07:58.6611224Z 2022-11-23T02:07:58.6611911Z Expand the folded group to see the log file of distributed/pipeline/sync/skip/test_tracker 2022-11-23T02:07:58.6613219Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/skip/test_tracker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_tracker_srirbxc3) 2022-11-23T02:07:58.6613787Z ============================= test session starts ============================== 2022-11-23T02:07:58.6614553Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:07:58.6614922Z cachedir: .pytest_cache 2022-11-23T02:07:58.6615503Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:07:58.6615951Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:07:58.6616286Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:07:58.6616849Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:07:58.6617256Z collecting ... 
collected 6 items 2022-11-23T02:07:58.6618362Z Running 6 items in this shard: test/distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker, test/distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker_by_data_parallel, test/distributed/pipeline/sync/skip/test_tracker.py::test_reuse_portal, test/distributed/pipeline/sync/skip/test_tracker.py::test_no_copy_no_portal, test/distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_without_checkpointing, test/distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_with_checkpointing 2022-11-23T02:07:58.6619191Z 2022-11-23T02:07:58.6619433Z distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker PASSED [ 16%] 2022-11-23T02:07:58.6619939Z distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker_by_data_parallel PASSED [ 33%] 2022-11-23T02:07:58.6620411Z distributed/pipeline/sync/skip/test_tracker.py::test_reuse_portal PASSED [ 50%] 2022-11-23T02:07:58.6620873Z distributed/pipeline/sync/skip/test_tracker.py::test_no_copy_no_portal PASSED [ 66%] 2022-11-23T02:07:58.6621374Z distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_without_checkpointing PASSED [ 83%] 2022-11-23T02:07:58.6621888Z distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_with_checkpointing PASSED [100%] 2022-11-23T02:07:58.6622167Z 2022-11-23T02:07:58.6622311Z ============================== 6 passed in 1.40s =============================== 2022-11-23T02:07:58.6622514Z 2022-11-23T02:07:58.6622826Z ##[endgroup] 2022-11-23T02:07:58.6623508Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/skip/test_tracker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_tracker_srirbxc3) 2022-11-23T02:07:58.6623909Z 2022-11-23T02:07:58.6624183Z Running distributed/pipeline/sync/skip/test_leak ... 
[2022-11-23 02:07:58.661162] 2022-11-23T02:07:58.6624814Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_leak.py', '-v'] ... [2022-11-23 02:07:58.661437] 2022-11-23T02:08:01.3506005Z 2022-11-23T02:08:01.3506600Z Expand the folded group to see the log file of distributed/pipeline/sync/skip/test_leak 2022-11-23T02:08:01.3507904Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/skip/test_leak (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_leak_dbb_yhn9) 2022-11-23T02:08:01.3508482Z ============================= test session starts ============================== 2022-11-23T02:08:01.3509111Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:08:01.3509477Z cachedir: .pytest_cache 2022-11-23T02:08:01.3510033Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:08:01.3510480Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:08:01.3510814Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:08:01.3511373Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:08:01.3512057Z collecting ... 
collected 8 items 2022-11-23T02:08:01.3513836Z Running 8 items in this shard: test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-train], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-eval], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-train], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-eval], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-train], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-eval], test/distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[train], test/distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[eval] 2022-11-23T02:08:01.3514899Z 2022-11-23T02:08:01.3515938Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-train] PASSED [ 12%] 2022-11-23T02:08:01.3516569Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-eval] PASSED [ 25%] 2022-11-23T02:08:01.3517184Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-train] PASSED [ 37%] 2022-11-23T02:08:01.3517785Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-eval] PASSED [ 50%] 2022-11-23T02:08:01.3518502Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-train] PASSED [ 62%] 2022-11-23T02:08:01.3519113Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-eval] PASSED [ 75%] 2022-11-23T02:08:01.3519588Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[train] PASSED [ 87%] 2022-11-23T02:08:01.3520096Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[eval] PASSED [100%] 2022-11-23T02:08:01.3520367Z 2022-11-23T02:08:01.3520530Z ============================== 8 passed in 0.28s =============================== 
2022-11-23T02:08:01.3520736Z 2022-11-23T02:08:01.3522815Z ##[endgroup] 2022-11-23T02:08:01.3523498Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/skip/test_leak (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_leak_dbb_yhn9) 2022-11-23T02:08:01.3523897Z 2022-11-23T02:08:01.3524187Z Running distributed/pipeline/sync/skip/test_api ... [2022-11-23 02:08:01.350712] 2022-11-23T02:08:01.3524818Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_api.py', '-v'] ... [2022-11-23 02:08:01.350981] 2022-11-23T02:08:03.7683299Z 2022-11-23T02:08:03.7684049Z Expand the folded group to see the log file of distributed/pipeline/sync/skip/test_api 2022-11-23T02:08:03.7685417Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/skip/test_api (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_api_9nmjccla) 2022-11-23T02:08:03.7686058Z ============================= test session starts ============================== 2022-11-23T02:08:03.7686700Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:08:03.7687066Z cachedir: .pytest_cache 2022-11-23T02:08:03.7687779Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:08:03.7688572Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:08:03.7689135Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:08:03.7690215Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:08:03.7690989Z collecting ... 
collected 3 items 2022-11-23T02:08:03.7692050Z Running 3 items in this shard: test/distributed/pipeline/sync/skip/test_api.py::test_namespace_difference, test/distributed/pipeline/sync/skip/test_api.py::test_namespace_copy, test/distributed/pipeline/sync/skip/test_api.py::test_skippable_repr 2022-11-23T02:08:03.7692871Z 2022-11-23T02:08:03.7693608Z distributed/pipeline/sync/skip/test_api.py::test_namespace_difference PASSED [ 33%] 2022-11-23T02:08:03.7694385Z distributed/pipeline/sync/skip/test_api.py::test_namespace_copy PASSED [ 66%] 2022-11-23T02:08:03.7695111Z distributed/pipeline/sync/skip/test_api.py::test_skippable_repr PASSED [100%] 2022-11-23T02:08:03.7695527Z 2022-11-23T02:08:03.7695786Z ============================== 3 passed in 0.05s =============================== 2022-11-23T02:08:03.7696092Z 2022-11-23T02:08:03.7696601Z ##[endgroup] 2022-11-23T02:08:03.7697645Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/skip/test_api (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_api_9nmjccla) 2022-11-23T02:08:03.7698378Z 2022-11-23T02:08:03.7698711Z Running distributed/elastic/timer/api_test ... [2022-11-23 02:08:03.768452] 2022-11-23T02:08:03.7699411Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/timer/api_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:08:03.768816] 2022-11-23T02:08:05.6309322Z 2022-11-23T02:08:05.6310089Z Expand the folded group to see the log file of distributed/elastic/timer/api_test 2022-11-23T02:08:05.6311160Z ##[group]PRINTING LOG FILE of distributed/elastic/timer/api_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-timer-api_test_040yvqnh) 2022-11-23T02:08:05.6311539Z 2022-11-23T02:08:05.6311836Z ##[endgroup] 2022-11-23T02:08:05.6312776Z FINISHED PRINTING LOG FILE of distributed/elastic/timer/api_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-timer-api_test_040yvqnh) 2022-11-23T02:08:05.6313160Z 2022-11-23T02:08:05.6313924Z Running distributed/checkpoint/test_dedup_tensors ... [2022-11-23 02:08:05.631059] 2022-11-23T02:08:05.6317497Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/checkpoint/test_dedup_tensors.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:05.631355] 2022-11-23T02:08:09.6058354Z 2022-11-23T02:08:09.6058877Z Expand the folded group to see the log file of distributed/checkpoint/test_dedup_tensors 2022-11-23T02:08:09.6059993Z ##[group]PRINTING LOG FILE of distributed/checkpoint/test_dedup_tensors (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_dedup_tensors_uly2g_b9) 2022-11-23T02:08:09.6060566Z 2022-11-23T02:08:09.6060685Z Running tests... 2022-11-23T02:08:09.6061183Z ---------------------------------------------------------------------- 2022-11-23T02:08:09.6061775Z Test results will be stored in test-reports/python-unittest/distributed.checkpoint.test_dedup_tensors 2022-11-23T02:08:09.6062258Z test_dedup_shards (__main__.TestDedupTensor) ... 
ok (1.661s) 2022-11-23T02:08:09.6062481Z 2022-11-23T02:08:09.6062729Z ---------------------------------------------------------------------- 2022-11-23T02:08:09.6063065Z Ran 1 test in 1.661s 2022-11-23T02:08:09.6063227Z 2022-11-23T02:08:09.6063321Z OK 2022-11-23T02:08:09.6063456Z 2022-11-23T02:08:09.6063581Z Generating XML reports... 2022-11-23T02:08:09.6064203Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_dedup_tensors/TEST-TestDedupTensor-20221123020807.xml 2022-11-23T02:08:09.6064567Z 2022-11-23T02:08:09.6064884Z ##[endgroup] 2022-11-23T02:08:09.6065531Z FINISHED PRINTING LOG FILE of distributed/checkpoint/test_dedup_tensors (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_dedup_tensors_uly2g_b9) 2022-11-23T02:08:09.6065922Z 2022-11-23T02:08:09.6066222Z Running distributed/_shard/sharded_tensor/ops/test_math_ops ... [2022-11-23 02:08:09.605888] 2022-11-23T02:08:09.6066965Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_math_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:09.606214] 2022-11-23T02:08:11.7002340Z 2022-11-23T02:08:11.7003244Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T02:08:11.7004250Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_math_ops_l45q5oj2) 2022-11-23T02:08:11.7004938Z 2022-11-23T02:08:11.7005248Z ##[endgroup] 2022-11-23T02:08:11.7006059Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_math_ops_l45q5oj2) 2022-11-23T02:08:11.7006475Z 2022-11-23T02:08:11.7006765Z Running distributed/_composable/test_checkpoint ... 
[2022-11-23 02:08:11.700325] 2022-11-23T02:08:11.7009331Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_composable/test_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:11.700623] 2022-11-23T02:08:16.0878387Z 2022-11-23T02:08:16.0878863Z Expand the folded group to see the log file of distributed/_composable/test_checkpoint 2022-11-23T02:08:16.0879879Z ##[group]PRINTING LOG FILE of distributed/_composable/test_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-_composable-test_checkpoint_by87vg_a) 2022-11-23T02:08:16.0880286Z 2022-11-23T02:08:16.0880399Z Running tests... 2022-11-23T02:08:16.0880922Z ---------------------------------------------------------------------- 2022-11-23T02:08:16.0881486Z Test results will be stored in test-reports/python-unittest/distributed._composable.test_checkpoint 2022-11-23T02:08:16.0882196Z test_tensor_only_cpu (__main__.TestCheckpoint) ... ok (0.023s) 2022-11-23T02:08:16.0882651Z test_tensor_only_gpu (__main__.TestCheckpoint) ... ok (0.407s) 2022-11-23T02:08:16.0882865Z 2022-11-23T02:08:16.0883165Z ---------------------------------------------------------------------- 2022-11-23T02:08:16.0883482Z Ran 2 tests in 0.430s 2022-11-23T02:08:16.0883648Z 2022-11-23T02:08:16.0883746Z OK 2022-11-23T02:08:16.0883920Z 2022-11-23T02:08:16.0884050Z Generating XML reports... 2022-11-23T02:08:16.0884643Z Generated XML report: test-reports/python-unittest/distributed._composable.test_checkpoint/TEST-TestCheckpoint-20221123020815.xml 2022-11-23T02:08:16.0885006Z 2022-11-23T02:08:16.0885327Z ##[endgroup] 2022-11-23T02:08:16.0885960Z FINISHED PRINTING LOG FILE of distributed/_composable/test_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-_composable-test_checkpoint_by87vg_a) 2022-11-23T02:08:16.0886329Z 2022-11-23T02:08:16.0886577Z Running distributed/test_launcher ... 
[2022-11-23 02:08:16.087898] 2022-11-23T02:08:16.0887252Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_launcher.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:16.088290] 2022-11-23T02:08:20.5474052Z 2022-11-23T02:08:20.5474803Z Expand the folded group to see the log file of distributed/test_launcher 2022-11-23T02:08:20.5476352Z ##[group]PRINTING LOG FILE of distributed/test_launcher (/var/lib/jenkins/workspace/test/test-reports/distributed-test_launcher_y_hefdob) 2022-11-23T02:08:20.5476851Z 2022-11-23T02:08:20.5477063Z Running tests... 2022-11-23T02:08:20.5477900Z ---------------------------------------------------------------------- 2022-11-23T02:08:20.5478431Z Test results will be stored in test-reports/python-unittest/distributed.test_launcher 2022-11-23T02:08:20.5479600Z test_launch_user_script (__main__.TestDistributedLaunch) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79488 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.670s) 2022-11-23T02:08:20.5480555Z 2022-11-23T02:08:20.5480851Z ---------------------------------------------------------------------- 2022-11-23T02:08:20.5481190Z Ran 1 test in 1.671s 2022-11-23T02:08:20.5481357Z 2022-11-23T02:08:20.5481470Z OK (skipped=1) 2022-11-23T02:08:20.5481629Z 2022-11-23T02:08:20.5481738Z Generating XML reports... 
2022-11-23T02:08:20.5482342Z Generated XML report: test-reports/python-unittest/distributed.test_launcher/TEST-TestDistributedLaunch-20221123020818.xml 2022-11-23T02:08:20.5483002Z 2022-11-23T02:08:20.5483333Z ##[endgroup] 2022-11-23T02:08:20.5483888Z FINISHED PRINTING LOG FILE of distributed/test_launcher (/var/lib/jenkins/workspace/test/test-reports/distributed-test_launcher_y_hefdob) 2022-11-23T02:08:20.5484227Z 2022-11-23T02:08:20.5484527Z Running distributed/elastic/metrics/api_test ... [2022-11-23 02:08:20.547549] 2022-11-23T02:08:20.5485224Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/metrics/api_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:20.547839] 2022-11-23T02:08:24.5006558Z 2022-11-23T02:08:24.5007291Z Expand the folded group to see the log file of distributed/elastic/metrics/api_test 2022-11-23T02:08:24.5008241Z ##[group]PRINTING LOG FILE of distributed/elastic/metrics/api_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-metrics-api_test_6c4e494a) 2022-11-23T02:08:24.5008624Z 2022-11-23T02:08:24.5008753Z Running tests... 2022-11-23T02:08:24.5009330Z ---------------------------------------------------------------------- 2022-11-23T02:08:24.5010208Z Test results will be stored in test-reports/python-unittest/distributed.elastic.metrics.api_test 2022-11-23T02:08:24.5010676Z test_get_metric_name (__main__.MetricsApiTest) ... ok (1.664s) 2022-11-23T02:08:24.5011281Z test_inheritance (__main__.MetricsApiTest) ... ok (0.002s) 2022-11-23T02:08:24.5011651Z test_profile (__main__.MetricsApiTest) ... ok (0.002s) 2022-11-23T02:08:24.5011863Z 2022-11-23T02:08:24.5012139Z ---------------------------------------------------------------------- 2022-11-23T02:08:24.5012659Z Ran 3 tests in 1.668s 2022-11-23T02:08:24.5012971Z 2022-11-23T02:08:24.5013073Z OK 2022-11-23T02:08:24.5013197Z 2022-11-23T02:08:24.5013322Z Generating XML reports... 
2022-11-23T02:08:24.5013952Z Generated XML report: test-reports/python-unittest/distributed.elastic.metrics.api_test/TEST-MetricsApiTest-20221123020822.xml 2022-11-23T02:08:24.5014319Z 2022-11-23T02:08:24.5014630Z ##[endgroup] 2022-11-23T02:08:24.5015241Z FINISHED PRINTING LOG FILE of distributed/elastic/metrics/api_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-metrics-api_test_6c4e494a) 2022-11-23T02:08:24.5015614Z 2022-11-23T02:08:24.5015936Z Running distributed/_shard/sharded_optim/test_sharded_optim ... [2022-11-23 02:08:24.500772] 2022-11-23T02:08:24.5016679Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_optim/test_sharded_optim.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:24.501096] 2022-11-23T02:08:30.8098633Z 2022-11-23T02:08:30.8099349Z Expand the folded group to see the log file of distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T02:08:30.8100375Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_optim/test_sharded_optim (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_optim-test_sharded_optim_z359hykh) 2022-11-23T02:08:30.8100825Z 2022-11-23T02:08:30.8100940Z Running tests... 2022-11-23T02:08:30.8101458Z ---------------------------------------------------------------------- 2022-11-23T02:08:30.8102057Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim 2022-11-23T02:08:30.8102647Z test_named_params_with_sharded_tensor (__main__.TestShardedOptimizer) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:08:30.8103712Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82023 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.599s) 2022-11-23T02:08:30.8104521Z test_sharded_optim (__main__.TestShardedOptimizer) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39992 2022-11-23T02:08:30.8105034Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39993 2022-11-23T02:08:30.8105785Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39994 2022-11-23T02:08:30.8106235Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 39995 2022-11-23T02:08:30.8106854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:30.8107339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:30.8107920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:30.8108397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:30.8108971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:30.8109417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:30.8109994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:30.8110464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:30.8111025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:30.8111472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:30.8112160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:30.8112635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:30.8113198Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:30.8113647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:30.8114219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:30.8114673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:30.8115481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:08:30.8116368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:08:30.8117211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:08:30.8117678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:08:30.8118071Z skip: Need at least 4 CUDA devices (2.413s) 2022-11-23T02:08:30.8118266Z 2022-11-23T02:08:30.8118553Z ---------------------------------------------------------------------- 2022-11-23T02:08:30.8118872Z Ran 2 tests in 4.013s 2022-11-23T02:08:30.8119036Z 2022-11-23T02:08:30.8119146Z OK (skipped=2) 2022-11-23T02:08:30.8119299Z 2022-11-23T02:08:30.8119423Z Generating XML reports... 2022-11-23T02:08:30.8120073Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20221123020826.xml 2022-11-23T02:08:30.8120456Z 2022-11-23T02:08:30.8120762Z ##[endgroup] 2022-11-23T02:08:30.8121437Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_optim/test_sharded_optim (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_optim-test_sharded_optim_z359hykh) 2022-11-23T02:08:30.8121835Z 2022-11-23T02:08:30.8122162Z Running distributed/_shard/sharded_tensor/test_megatron_prototype ... 
[2022-11-23 02:08:30.809947] 2022-11-23T02:08:30.8122911Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_megatron_prototype.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:30.810334] 2022-11-23T02:08:37.1565794Z 2022-11-23T02:08:37.1566763Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/test_megatron_prototype 2022-11-23T02:08:37.1568881Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_megatron_prototype (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_megatron_prototype_giqln3ko) 2022-11-23T02:08:37.1569327Z 2022-11-23T02:08:37.1569444Z Running tests... 2022-11-23T02:08:37.1569978Z ---------------------------------------------------------------------- 2022-11-23T02:08:37.1570614Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype 2022-11-23T02:08:37.1571208Z test_megatron_two_layer_prototype (__main__.TestShardedTensorMegatronLinear) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:08:37.1571734Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40163 2022-11-23T02:08:37.1572174Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40164 2022-11-23T02:08:37.1572618Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40165 2022-11-23T02:08:37.1573074Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40166 2022-11-23T02:08:37.1573710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:37.1574152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:37.1574869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:37.1575361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:37.1575933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:37.1576385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:37.1576962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:37.1577438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:37.1578005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:37.1578457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:37.1579039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:37.1579506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:37.1580070Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:37.1580516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:37.1581090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:37.1581538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:37.1581984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:08:37.1582460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:08:37.1582933Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:08:37.1583441Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:08:37.1583834Z skip: Need at least 4 CUDA devices (4.019s) 2022-11-23T02:08:37.1584028Z 2022-11-23T02:08:37.1584305Z ---------------------------------------------------------------------- 2022-11-23T02:08:37.1584618Z Ran 1 test in 4.019s 2022-11-23T02:08:37.1584780Z 2022-11-23T02:08:37.1584891Z OK (skipped=1) 2022-11-23T02:08:37.1585046Z 2022-11-23T02:08:37.1585172Z Generating XML reports... 2022-11-23T02:08:37.1585880Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype/TEST-TestShardedTensorMegatronLinear-20221123020832.xml 2022-11-23T02:08:37.1586371Z 2022-11-23T02:08:37.1586690Z ##[endgroup] 2022-11-23T02:08:37.1587390Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/test_megatron_prototype (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-test_megatron_prototype_giqln3ko) 2022-11-23T02:08:37.1587808Z 2022-11-23T02:08:37.1588153Z Running distributed/_tensor/parallel/test_view_sharding_dim_change ... 
[2022-11-23 02:08:37.156711] 2022-11-23T02:08:37.1588905Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/parallel/test_view_sharding_dim_change.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:37.157052] 2022-11-23T02:08:44.4331724Z 2022-11-23T02:08:44.4332555Z Expand the folded group to see the log file of distributed/_tensor/parallel/test_view_sharding_dim_change 2022-11-23T02:08:44.4333596Z ##[group]PRINTING LOG FILE of distributed/_tensor/parallel/test_view_sharding_dim_change (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_view_sharding_dim_change_1ulkky2g) 2022-11-23T02:08:44.4334049Z 2022-11-23T02:08:44.4334166Z Running tests... 2022-11-23T02:08:44.4334710Z ---------------------------------------------------------------------- 2022-11-23T02:08:44.4335572Z Test results will be stored in test-reports/python-unittest/distributed._tensor.parallel.test_view_sharding_dim_change 2022-11-23T02:08:44.4336187Z test_view_with_sharding_dim_change (__main__.TPViewShardingDimChangeTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:08:44.4336709Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40334 2022-11-23T02:08:44.4337173Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40335 2022-11-23T02:08:44.4337805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:44.4338264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:44.4338867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:44.4339328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:44.4339920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:44.4340377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:44.4340961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:44.4341416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:44.4341866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:08:44.4342363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:08:44.4342842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:08:44.4343330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:08:44.4343998Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:08:44.4344702Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:08:44.4345086Z ok (4.900s) 2022-11-23T02:08:44.4345237Z 2022-11-23T02:08:44.4345510Z ---------------------------------------------------------------------- 2022-11-23T02:08:44.4345844Z Ran 1 test in 4.900s 2022-11-23T02:08:44.4346011Z 2022-11-23T02:08:44.4346108Z OK 2022-11-23T02:08:44.4346226Z 2022-11-23T02:08:44.4346354Z Generating XML reports... 2022-11-23T02:08:44.4347053Z Generated XML report: test-reports/python-unittest/distributed._tensor.parallel.test_view_sharding_dim_change/TEST-TPViewShardingDimChangeTest-20221123020839.xml 2022-11-23T02:08:44.4347615Z 2022-11-23T02:08:44.4347930Z ##[endgroup] 2022-11-23T02:08:44.4348624Z FINISHED PRINTING LOG FILE of distributed/_tensor/parallel/test_view_sharding_dim_change (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_view_sharding_dim_change_1ulkky2g) 2022-11-23T02:08:44.4349046Z 2022-11-23T02:08:44.4349338Z Running distributed/fsdp/test_fsdp_pure_fp16 ... [2022-11-23 02:08:44.433316] 2022-11-23T02:08:44.4350037Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_pure_fp16.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:44.433675] 2022-11-23T02:08:52.2315766Z 2022-11-23T02:08:52.2316657Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_pure_fp16 2022-11-23T02:08:52.2317746Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_pure_fp16_n5ea3o8u) 2022-11-23T02:08:52.2318144Z 2022-11-23T02:08:52.2318265Z Running tests... 
2022-11-23T02:08:52.2318792Z ---------------------------------------------------------------------- 2022-11-23T02:08:52.2319385Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16 2022-11-23T02:08:52.2320133Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=False) (__main__.TestPureFP16) 2022-11-23T02:08:52.2321246Z Tests pure FP16 training, including when the parameter's dtype is ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:08:52.2322314Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/73315 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.659s) 2022-11-23T02:08:52.2323174Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=True) (__main__.TestPureFP16) 2022-11-23T02:08:52.2324070Z Tests pure FP16 training, including when the parameter's dtype is ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40448 2022-11-23T02:08:52.2324615Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40449 2022-11-23T02:08:52.2325242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:52.2325705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:52.2326270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:52.2326744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:52.2327382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:08:52.2327847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:08:52.2328437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:08:52.2328896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:08:52.2329358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:08:52.2329865Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:08:52.2330521Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:08:52.2331226Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:08:52.2331755Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:08:52.2332234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:08:52.2333648Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:08:52.2334434Z warnings.warn( 2022-11-23T02:08:52.2335586Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:08:52.2336368Z warnings.warn( 2022-11-23T02:08:52.2336641Z File "<string>", line 1, in <module> 2022-11-23T02:08:52.2337001Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:08:52.2337379Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:08:52.2337818Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:08:52.2338210Z return self._bootstrap(parent_sentinel) 2022-11-23T02:08:52.2338591Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:08:52.2338934Z self.run() 2022-11-23T02:08:52.2339270Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:08:52.2339624Z self._target(*self._args, **self._kwargs) 2022-11-23T02:08:52.2340151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:08:52.2340548Z self.run_test(test_name, pipe) 2022-11-23T02:08:52.2341069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:08:52.2341515Z getattr(self, test_name)() 2022-11-23T02:08:52.2342042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:08:52.2342415Z fn() 2022-11-23T02:08:52.2342896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:08:52.2343290Z test(self, **param_kwargs) 2022-11-23T02:08:52.2343812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:08:52.2344188Z return func(*args, **kwargs) 2022-11-23T02:08:52.2344588Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T02:08:52.2344969Z self._test_fsdp_parity( 2022-11-23T02:08:52.2345482Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:08:52.2345904Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:08:52.2346463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:08:52.2346870Z output = model(*input) 2022-11-23T02:08:52.2347342Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:08:52.2347735Z return forward_call(*input, **kwargs) 2022-11-23T02:08:52.2348293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:08:52.2348737Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:08:52.2349311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:08:52.2349783Z _lazy_init(state, module) 2022-11-23T02:08:52.2350297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:08:52.2350694Z handle.init_flat_param_attributes() 2022-11-23T02:08:52.2351221Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:08:52.2351605Z return func(*args, **kwargs) 2022-11-23T02:08:52.2352131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:08:52.2352518Z p_assert( 2022-11-23T02:08:52.2352996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:08:52.2353383Z traceback.print_stack() 2022-11-23T02:08:52.2353653Z File "<string>", line 1, in <module> 2022-11-23T02:08:52.2354028Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 
2022-11-23T02:08:52.2354407Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:08:52.2354764Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:08:52.2355487Z return self._bootstrap(parent_sentinel) 2022-11-23T02:08:52.2355985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:08:52.2356324Z self.run() 2022-11-23T02:08:52.2356662Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:08:52.2357030Z self._target(*self._args, **self._kwargs) 2022-11-23T02:08:52.2357561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:08:52.2357936Z self.run_test(test_name, pipe) 2022-11-23T02:08:52.2358468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:08:52.2358868Z getattr(self, test_name)() 2022-11-23T02:08:52.2359378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:08:52.2359750Z fn() 2022-11-23T02:08:52.2360247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:08:52.2360631Z test(self, **param_kwargs) 2022-11-23T02:08:52.2361153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:08:52.2361549Z return func(*args, **kwargs) 2022-11-23T02:08:52.2361947Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T02:08:52.2362307Z self._test_fsdp_parity( 2022-11-23T02:08:52.2362834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:08:52.2363264Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:08:52.2363811Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:08:52.2364208Z output = model(*input) 2022-11-23T02:08:52.2364694Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:08:52.2365086Z return forward_call(*input, **kwargs) 2022-11-23T02:08:52.2365625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:08:52.2366083Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:08:52.2366659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:08:52.2367037Z _lazy_init(state, module) 2022-11-23T02:08:52.2367550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:08:52.2368068Z handle.init_flat_param_attributes() 2022-11-23T02:08:52.2368590Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:08:52.2368959Z return func(*args, **kwargs) 2022-11-23T02:08:52.2369505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:08:52.2369892Z p_assert( 2022-11-23T02:08:52.2370353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:08:52.2370735Z traceback.print_stack() 2022-11-23T02:08:52.2371006Z dist init r=0, world=2 2022-11-23T02:08:52.2371467Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:08:52.2371913Z dist init r=1, world=2 2022-11-23T02:08:52.2372385Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:08:52.2372815Z ok (3.813s) 2022-11-23T02:08:52.2372968Z 2022-11-23T02:08:52.2373230Z ---------------------------------------------------------------------- 2022-11-23T02:08:52.2373566Z Ran 2 tests in 5.472s 2022-11-23T02:08:52.2373731Z 2022-11-23T02:08:52.2373901Z OK (skipped=1) 2022-11-23T02:08:52.2374069Z 2022-11-23T02:08:52.2374195Z Generating XML reports... 2022-11-23T02:08:52.2374773Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20221123020846.xml 2022-11-23T02:08:52.2375117Z 2022-11-23T02:08:52.2375440Z ##[endgroup] 2022-11-23T02:08:52.2376058Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_pure_fp16_n5ea3o8u) 2022-11-23T02:08:52.2376401Z 2022-11-23T02:08:52.2376701Z Running distributed/elastic/timer/local_timer_test ... [2022-11-23 02:08:52.231835] 2022-11-23T02:08:52.2377415Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/timer/local_timer_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:08:52.232157] 2022-11-23T02:09:00.5749604Z 2022-11-23T02:09:00.5750136Z Expand the folded group to see the log file of distributed/elastic/timer/local_timer_test 2022-11-23T02:09:00.5751133Z ##[group]PRINTING LOG FILE of distributed/elastic/timer/local_timer_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-timer-local_timer_test_qol1w0iw) 2022-11-23T02:09:00.5751531Z 2022-11-23T02:09:00.5751651Z Running tests... 
2022-11-23T02:09:00.5752174Z ---------------------------------------------------------------------- 2022-11-23T02:09:00.5752756Z Test results will be stored in test-reports/python-unittest/distributed.elastic.timer.local_timer_test 2022-11-23T02:09:00.5753227Z test_acquire_release (__main__.LocalTimerServerTest) 2022-11-23T02:09:00.5754253Z tests that: ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/87154 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.609s) 2022-11-23T02:09:00.5754960Z test_expired_timers (__main__.LocalTimerServerTest) 2022-11-23T02:09:00.5755647Z tests that a single expired timer on a process should terminate ... ok (0.004s) 2022-11-23T02:09:00.5756047Z test_valid_timers (__main__.LocalTimerServerTest) 2022-11-23T02:09:00.5756472Z tests that valid timers are processed correctly and the process is left alone ... ok (0.003s) 2022-11-23T02:09:00.5756903Z test_watchdog_call_count (__main__.LocalTimerServerTest) 2022-11-23T02:09:00.5757385Z checks that the watchdog function ran wait/interval +- 1 times ... ok (0.104s) 2022-11-23T02:09:00.5757791Z test_watchdog_empty_queue (__main__.LocalTimerServerTest) 2022-11-23T02:09:00.5758482Z checks that the watchdog can run on an empty queue ... ok (0.011s) 2022-11-23T02:09:00.5758873Z test_client_interaction (__main__.LocalTimerTest) ... ok (0.004s) 2022-11-23T02:09:00.5759254Z test_exception_propagation (__main__.LocalTimerTest) ... ok (0.011s) 2022-11-23T02:09:00.5759633Z test_get_timer_recursive (__main__.LocalTimerTest) 2022-11-23T02:09:00.5760343Z If a function acquires a countdown timer with default scope, ... 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:00.5760855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:00.5761441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:00.5761917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:00.5762233Z ok (2.412s) 2022-11-23T02:09:00.5762521Z test_happy_path (__main__.LocalTimerTest) ... ok (0.103s) 2022-11-23T02:09:00.5762890Z test_no_client (__main__.LocalTimerTest) ... ok (0.011s) 2022-11-23T02:09:00.5763241Z test_timer (__main__.LocalTimerTest) ... ok (0.156s) 2022-11-23T02:09:00.5763617Z test_get (__main__.MultiprocessingRequestQueueTest) ... ok (0.023s) 2022-11-23T02:09:00.5764062Z test_get_less_than_size (__main__.MultiprocessingRequestQueueTest) 2022-11-23T02:09:00.5764533Z Tests slow producer. ... ok (0.515s) 2022-11-23T02:09:00.5764893Z test_get_size (__main__.MultiprocessingRequestQueueTest) 2022-11-23T02:09:00.5765302Z Creates a "producer" process that enqueues ``n`` elements ... ok (0.922s) 2022-11-23T02:09:00.5765544Z 2022-11-23T02:09:00.5765823Z ---------------------------------------------------------------------- 2022-11-23T02:09:00.5766158Z Ran 14 tests in 5.892s 2022-11-23T02:09:00.5766306Z 2022-11-23T02:09:00.5766418Z OK (skipped=1) 2022-11-23T02:09:00.5766575Z 2022-11-23T02:09:00.5766705Z Generating XML reports... 
2022-11-23T02:09:00.5767355Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerServerTest-20221123020854.xml 2022-11-23T02:09:00.5768164Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerTest-20221123020854.xml 2022-11-23T02:09:00.5769032Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-MultiprocessingRequestQueueTest-20221123020854.xml 2022-11-23T02:09:00.5769452Z 2022-11-23T02:09:00.5769769Z ##[endgroup] 2022-11-23T02:09:00.5770416Z FINISHED PRINTING LOG FILE of distributed/elastic/timer/local_timer_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-timer-local_timer_test_qol1w0iw) 2022-11-23T02:09:00.5770785Z 2022-11-23T02:09:00.5771110Z Running distributed/_shard/sharded_tensor/ops/test_embedding_bag ... [2022-11-23 02:09:00.575102] 2022-11-23T02:09:00.5771868Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:09:00.575434] 2022-11-23T02:09:09.3584027Z 2022-11-23T02:09:09.3584850Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-11-23T02:09:09.3585905Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_embedding_bag (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_embedding_bag_icepw2wk) 2022-11-23T02:09:09.3586336Z 2022-11-23T02:09:09.3586434Z Running tests... 2022-11-23T02:09:09.3586992Z ---------------------------------------------------------------------- 2022-11-23T02:09:09.3587611Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag 2022-11-23T02:09:09.3588193Z test_sharded_embedding_bag_colwise (__main__.TestShardedEmbeddingBag) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:09:09.3588995Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40655 2022-11-23T02:09:09.3589434Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40656 2022-11-23T02:09:09.3589889Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40657 2022-11-23T02:09:09.3590342Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40658 2022-11-23T02:09:09.3590980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:09.3591443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3592035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3592517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3593090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:09.3593551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3594134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3594612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3595580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:09.3596059Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3596649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3597104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3597688Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:09.3598144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3598725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3599177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3599624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:09.3600113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:09.3600570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:09.3601045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:09.3601437Z skip: Need at least 4 CUDA devices (4.071s) 2022-11-23T02:09:09.3601947Z test_sharded_embedding_bag_rowwise (__main__.TestShardedEmbeddingBag) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40791 2022-11-23T02:09:09.3602490Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40792 2022-11-23T02:09:09.3602944Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40793 2022-11-23T02:09:09.3603395Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40794 2022-11-23T02:09:09.3604021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:09.3604462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3605046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3605521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3606086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:09.3606654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3607242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3607714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3608286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:09.3608736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3609323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3609771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3610352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:09:09.3610799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:09.3611382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:09.3611831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:09.3612277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:09.3612822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:09.3613306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:09.3613761Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:09.3614157Z skip: Need at least 4 CUDA devices (2.409s) 2022-11-23T02:09:09.3614356Z 2022-11-23T02:09:09.3614636Z ---------------------------------------------------------------------- 2022-11-23T02:09:09.3614957Z Ran 2 tests in 6.481s 2022-11-23T02:09:09.3615129Z 2022-11-23T02:09:09.3615240Z OK (skipped=2) 2022-11-23T02:09:09.3615399Z 2022-11-23T02:09:09.3615528Z Generating XML reports... 2022-11-23T02:09:09.3616182Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag/TEST-TestShardedEmbeddingBag-20221123020902.xml 2022-11-23T02:09:09.3616585Z 2022-11-23T02:09:09.3616912Z ##[endgroup] 2022-11-23T02:09:09.3617617Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_embedding_bag (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_embedding_bag_icepw2wk) 2022-11-23T02:09:09.3618034Z 2022-11-23T02:09:09.3618355Z Running distributed/_shard/sharded_tensor/ops/test_softmax ... 
[2022-11-23 02:09:09.358601] 2022-11-23T02:09:09.3619073Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_softmax.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:09:09.358906] 2022-11-23T02:09:18.0999185Z 2022-11-23T02:09:18.0999973Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T02:09:18.1001003Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_softmax (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_softmax_03x7szyg) 2022-11-23T02:09:18.1001656Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzyyt7w7p 2022-11-23T02:09:18.1002430Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzyyt7w7p/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1002905Z 2022-11-23T02:09:18.1003082Z Running tests... 2022-11-23T02:09:18.1004130Z ---------------------------------------------------------------------- 2022-11-23T02:09:18.1005112Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax 2022-11-23T02:09:18.1005662Z test_sharded_softmax_basic (__main__.TestShardedSoftmax) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:09:18.1006629Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40962 2022-11-23T02:09:18.1007214Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40963 2022-11-23T02:09:18.1007710Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40964 2022-11-23T02:09:18.1008157Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40965 2022-11-23T02:09:18.1009027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1009488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1010072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1010533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:18.1011118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1011573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1012154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1012607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:18.1013641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1014118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1014688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1015157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:18.1015738Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1016193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1016752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1017219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:18.1017687Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmr5vdz7x 2022-11-23T02:09:18.1018218Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmr5vdz7x/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1018755Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppimdgjzz 2022-11-23T02:09:18.1019294Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppimdgjzz/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1019827Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp08fmr_ej 2022-11-23T02:09:18.1020342Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp08fmr_ej/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1020858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:18.1021333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:18.1021806Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:18.1022290Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpabv8nq37 2022-11-23T02:09:18.1022823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpabv8nq37/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1023327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:18.1023727Z skip: Need at least 4 CUDA 
devices (4.035s) 2022-11-23T02:09:18.1024717Z test_sharded_softmax_on_sharding_dim (__main__.TestShardedSoftmax) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41098 2022-11-23T02:09:18.1025523Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41099 2022-11-23T02:09:18.1025980Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41100 2022-11-23T02:09:18.1026413Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41101 2022-11-23T02:09:18.1027047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1027503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1028070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1028545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:18.1029135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1029587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1030149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1030617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:18.1031273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1031732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1032298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1032766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:09:18.1033345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:18.1033776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:18.1034365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:18.1034832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:18.1035562Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6xm3v_f9 2022-11-23T02:09:18.1036097Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6xm3v_f9/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1036633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbc3ko4n1 2022-11-23T02:09:18.1037169Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbc3ko4n1/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1037723Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplpxm5f8m 2022-11-23T02:09:18.1038244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplpxm5f8m/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1038754Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:18.1039231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:18.1039716Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz1ygr0x8 2022-11-23T02:09:18.1040256Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz1ygr0x8/_remote_module_non_scriptable.py 2022-11-23T02:09:18.1040766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:18.1041233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:18.1041612Z 
skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:09:18.1041809Z 2022-11-23T02:09:18.1042097Z ---------------------------------------------------------------------- 2022-11-23T02:09:18.1042434Z Ran 2 tests in 6.446s 2022-11-23T02:09:18.1042702Z 2022-11-23T02:09:18.1042796Z OK (skipped=2) 2022-11-23T02:09:18.1042953Z 2022-11-23T02:09:18.1043076Z Generating XML reports... 2022-11-23T02:09:18.1043718Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax/TEST-TestShardedSoftmax-20221123020911.xml 2022-11-23T02:09:18.1044092Z 2022-11-23T02:09:18.1044402Z ##[endgroup] 2022-11-23T02:09:18.1045082Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_softmax (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_softmax_03x7szyg) 2022-11-23T02:09:18.1045483Z 2022-11-23T02:09:18.1045756Z Running distributed/_tensor/test_view_ops ... [2022-11-23 02:09:18.100167] 2022-11-23T02:09:18.1046430Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_view_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:09:18.100494] 2022-11-23T02:09:27.1557465Z 2022-11-23T02:09:27.1558406Z Expand the folded group to see the log file of distributed/_tensor/test_view_ops 2022-11-23T02:09:27.1559578Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_view_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_view_ops_r7c0ao3r) 2022-11-23T02:09:27.1559924Z 2022-11-23T02:09:27.1560116Z Running tests... 2022-11-23T02:09:27.1560628Z ---------------------------------------------------------------------- 2022-11-23T02:09:27.1561443Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_view_ops 2022-11-23T02:09:27.1561948Z test_view_groups (__main__.TestViewOps) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:09:27.1562390Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41269 2022-11-23T02:09:27.1562844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41270 2022-11-23T02:09:27.1563281Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41271 2022-11-23T02:09:27.1563726Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41272 2022-11-23T02:09:27.1564163Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 41273 2022-11-23T02:09:27.1564600Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 41274 2022-11-23T02:09:27.1565241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1565685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1566280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1566763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1567350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1567780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1568355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1568809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1569391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1569860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1570454Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1570920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1571482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1571933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1572513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1573110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1573681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1574135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1574715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1575164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1575744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1576192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1576769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1577226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1577672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:27.1578153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:27.1578687Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 4 2022-11-23T02:09:27.1579153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:09:27.1579615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:27.1580083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:27.1580412Z ok (4.137s) 2022-11-23T02:09:27.1580817Z test_view_ops (__main__.TestViewOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41473 2022-11-23T02:09:27.1581323Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41474 2022-11-23T02:09:27.1581774Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41475 2022-11-23T02:09:27.1582203Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41476 2022-11-23T02:09:27.1582642Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 41477 2022-11-23T02:09:27.1583084Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 41478 2022-11-23T02:09:27.1583687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1584142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1584730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1585252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1585832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1586278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1586856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T02:09:27.1587315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1587902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1588347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1588913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1589347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1590009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1590482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1591056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1591528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1592110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1592556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1593116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1593584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1594167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:27.1594615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:27.1595555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:27.1596043Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:27.1596583Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:09:27.1597057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:27.1597531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:27.1598016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:27.1598476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:27.1598931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:09:27.1599317Z skip: Need at least 6 CUDA devices (2.521s) 2022-11-23T02:09:27.1599513Z 2022-11-23T02:09:27.1599806Z ---------------------------------------------------------------------- 2022-11-23T02:09:27.1600150Z Ran 2 tests in 6.658s 2022-11-23T02:09:27.1600297Z 2022-11-23T02:09:27.1600414Z OK (skipped=1) 2022-11-23T02:09:27.1600569Z 2022-11-23T02:09:27.1600697Z Generating XML reports... 2022-11-23T02:09:27.1601277Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_view_ops/TEST-TestViewOps-20221123020920.xml 2022-11-23T02:09:27.1601613Z 2022-11-23T02:09:27.1601923Z ##[endgroup] 2022-11-23T02:09:27.1602522Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_view_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_view_ops_r7c0ao3r) 2022-11-23T02:09:27.1602866Z 2022-11-23T02:09:27.1603137Z Running distributed/fsdp/test_fsdp_input ... [2022-11-23 02:09:27.155923] 2022-11-23T02:09:27.1603809Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_input.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:09:27.156233] 2022-11-23T02:09:38.1782110Z 2022-11-23T02:09:38.1782821Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_input 2022-11-23T02:09:38.1783901Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_input (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_input_lthcq8td) 2022-11-23T02:09:38.1784373Z 2022-11-23T02:09:38.1784580Z Running tests... 2022-11-23T02:09:38.1785208Z ---------------------------------------------------------------------- 2022-11-23T02:09:38.1785795Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_input 2022-11-23T02:09:38.1786223Z test_input_type_dict (__main__.TestInput) 2022-11-23T02:09:38.1786648Z Test FSDP with input being a list or a dict, only single GPU. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:09:38.1787426Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41712 2022-11-23T02:09:38.1788311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:38.1788806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:38.1789410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:38.1789876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:38.1790348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:09:38.1791023Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-11-23T02:09:38.1791554Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:38.1792934Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:09:38.1793742Z warnings.warn( 2022-11-23T02:09:38.1794000Z dist init r=0, world=1 2022-11-23T02:09:38.1794244Z ok (5.172s) 2022-11-23T02:09:38.1794503Z test_input_type_list (__main__.TestInput) 2022-11-23T02:09:38.1794985Z Test FSDP with input being a list or a dict, only single GPU. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41754 2022-11-23T02:09:38.1796059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:38.1796523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:38.1797096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:38.1797574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:38.1798045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:09:38.1798717Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-11-23T02:09:38.1799231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:38.1800502Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:09:38.1801285Z warnings.warn( 2022-11-23T02:09:38.1801545Z dist init r=0, world=1 2022-11-23T02:09:38.1801773Z ok (3.509s) 2022-11-23T02:09:38.1801924Z 2022-11-23T02:09:38.1802202Z ---------------------------------------------------------------------- 2022-11-23T02:09:38.1802545Z Ran 2 tests in 8.681s 2022-11-23T02:09:38.1802711Z 2022-11-23T02:09:38.1802806Z OK 2022-11-23T02:09:38.1802923Z 2022-11-23T02:09:38.1803050Z Generating XML reports... 2022-11-23T02:09:38.1803624Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20221123020929.xml 2022-11-23T02:09:38.1803956Z 2022-11-23T02:09:38.1804280Z ##[endgroup] 2022-11-23T02:09:38.1804864Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_input (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_input_lthcq8td) 2022-11-23T02:09:38.1805338Z 2022-11-23T02:09:38.1805646Z Running distributed/_shard/sharded_tensor/ops/test_init ... [2022-11-23 02:09:38.178342] 2022-11-23T02:09:38.1806373Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_init.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:09:38.178707] 2022-11-23T02:09:49.3881664Z 2022-11-23T02:09:49.3882474Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_init 2022-11-23T02:09:49.3883996Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_init (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_init_rmkfmldk) 2022-11-23T02:09:49.3884427Z 2022-11-23T02:09:49.3884549Z Running tests... 2022-11-23T02:09:49.3885308Z ---------------------------------------------------------------------- 2022-11-23T02:09:49.3886248Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init 2022-11-23T02:09:49.3887335Z test_init_sharded_tensor_with_kaiming_uniform (__main__.TestShardedTensorNNInit) 2022-11-23T02:09:49.3888378Z Test torch.nn.init.kaiming_uniform_(ShardedTensor, a, mode, nonlinearit) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:09:49.3889425Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41831 2022-11-23T02:09:49.3890240Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41832 2022-11-23T02:09:49.3891259Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41833 2022-11-23T02:09:49.3891965Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41834 2022-11-23T02:09:49.3892611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3893076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3893782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3894259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3894832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3895298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3895881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3896336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3896917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3897372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3897953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3898410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3898993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3899445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3900028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3900479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3900925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:49.3901406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:49.3901862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:49.3902493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:49.3902885Z skip: Need at least 4 CUDA devices (4.073s) 2022-11-23T02:09:49.3903277Z test_init_sharded_tensor_with_normal (__main__.TestShardedTensorNNInit) 
2022-11-23T02:09:49.3903799Z Test torch.nn.init.normal_(ShardedTensor, mean, std) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41967 2022-11-23T02:09:49.3904334Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41968 2022-11-23T02:09:49.3904785Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41969 2022-11-23T02:09:49.3905214Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41970 2022-11-23T02:09:49.3905839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3906300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3906890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3907350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3908001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3908467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3909056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3909550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3910138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3910591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3911182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3911635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3912213Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3912666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3913230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3913699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3914141Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:49.3914625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:49.3915511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:49.3916018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:49.3916421Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:09:49.3916792Z test_init_sharded_tensor_with_uniform (__main__.TestShardedTensorNNInit) 2022-11-23T02:09:49.3917325Z Test torch.nn.init.uniform_(ShardedTensor, a, b) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42103 2022-11-23T02:09:49.3917848Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42104 2022-11-23T02:09:49.3918298Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42105 2022-11-23T02:09:49.3918725Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42106 2022-11-23T02:09:49.3919350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3919937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3920530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3920989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3921585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3922033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3922597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3923071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3923658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:09:49.3924104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3924674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3925140Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3925724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:09:49.3926265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:09:49.3926867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:09:49.3927337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:09:49.3927776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:09:49.3928238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:09:49.3928712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:09:49.3929184Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:09:49.3929561Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:09:49.3929757Z 2022-11-23T02:09:49.3930038Z ---------------------------------------------------------------------- 2022-11-23T02:09:49.3930377Z Ran 3 tests in 8.894s 2022-11-23T02:09:49.3930546Z 2022-11-23T02:09:49.3930659Z OK (skipped=3) 2022-11-23T02:09:49.3930815Z 2022-11-23T02:09:49.3930923Z Generating XML reports... 2022-11-23T02:09:49.3931575Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20221123020940.xml 2022-11-23T02:09:49.3931967Z 2022-11-23T02:09:49.3932308Z ##[endgroup] 2022-11-23T02:09:49.3932944Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_init (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_init_rmkfmldk) 2022-11-23T02:09:49.3933341Z 2022-11-23T02:09:49.3933655Z Running distributed/_shard/sharded_tensor/ops/test_binary_cmp ... 
[2022-11-23 02:09:49.388384] 2022-11-23T02:09:49.3934405Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:09:49.388779] 2022-11-23T02:10:02.9710971Z 2022-11-23T02:10:02.9711713Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-11-23T02:10:02.9712749Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_binary_cmp (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_binary_cmp_3paler4u) 2022-11-23T02:10:02.9713163Z 2022-11-23T02:10:02.9713317Z Running tests... 2022-11-23T02:10:02.9713833Z ---------------------------------------------------------------------- 2022-11-23T02:10:02.9714706Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp 2022-11-23T02:10:02.9715558Z test_torch_allclose (__main__.TestShardedTensorBinaryOps) 2022-11-23T02:10:02.9716022Z Test torch.allclose(ShardedTensor, ShardedTensor) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:10:02.9716508Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42274 2022-11-23T02:10:02.9716956Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42275 2022-11-23T02:10:02.9717399Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42276 2022-11-23T02:10:02.9717848Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42277 2022-11-23T02:10:02.9718501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9718943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9719535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9720012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9720585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9721148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9721745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9722219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9722784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9723232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9723805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9724261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9724843Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9725295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9725879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9726332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9726777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:02.9727254Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:02.9727707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:10:02.9728183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:10:02.9728569Z skip: Need at least 4 CUDA devices (4.053s) 2022-11-23T02:10:02.9729085Z test_torch_allclose_tensor_specs (__main__.TestShardedTensorBinaryOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42410 2022-11-23T02:10:02.9729634Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42411 2022-11-23T02:10:02.9730083Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42412 2022-11-23T02:10:02.9730527Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42413 2022-11-23T02:10:02.9731145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9731585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9732168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9732755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9733329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9733778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9734361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9734833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9735466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9735899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9736474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9736950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9737536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:10:02.9737965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9738601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9739081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9739503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:10:02.9739981Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:02.9740457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:10:02.9740924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:02.9741306Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:10:02.9741668Z test_torch_equal (__main__.TestShardedTensorBinaryOps) 2022-11-23T02:10:02.9742173Z Test torch.equal(ShardedTensor, ShardedTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42546 2022-11-23T02:10:02.9742688Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42547 2022-11-23T02:10:02.9743139Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42548 2022-11-23T02:10:02.9743581Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42549 2022-11-23T02:10:02.9744210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9744649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9745229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9745715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9746283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9746733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9747303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9747751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9748300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9748744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9749321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9749852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9750449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T02:10:02.9750924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9751518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9751969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9752407Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:10:02.9752965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:02.9753440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:10:02.9753894Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:02.9754291Z skip: Need at least 4 CUDA devices (2.409s) 2022-11-23T02:10:02.9754805Z test_torch_equal_tensor_specs (__main__.TestShardedTensorBinaryOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42682 2022-11-23T02:10:02.9755619Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42683 2022-11-23T02:10:02.9756090Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42684 2022-11-23T02:10:02.9756533Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42685 2022-11-23T02:10:02.9757166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9757603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9758186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9758668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9759238Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9759688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9760272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9760740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9761303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9761754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9762331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9762803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9763366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:02.9763811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:02.9764394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:02.9764844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:02.9765284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:02.9765760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:02.9766232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:10:02.9766683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:10:02.9767158Z skip: Need at least 4 CUDA devices (2.409s) 
2022-11-23T02:10:02.9767355Z 2022-11-23T02:10:02.9767650Z ---------------------------------------------------------------------- 2022-11-23T02:10:02.9767970Z Ran 4 tests in 11.282s 2022-11-23T02:10:02.9768134Z 2022-11-23T02:10:02.9768245Z OK (skipped=4) 2022-11-23T02:10:02.9768400Z 2022-11-23T02:10:02.9768530Z Generating XML reports... 2022-11-23T02:10:02.9769213Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20221123020951.xml 2022-11-23T02:10:02.9769621Z 2022-11-23T02:10:02.9769922Z ##[endgroup] 2022-11-23T02:10:02.9770604Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_binary_cmp (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_binary_cmp_3paler4u) 2022-11-23T02:10:02.9771008Z 2022-11-23T02:10:02.9771287Z Running distributed/fsdp/test_fsdp_overlap ... [2022-11-23 02:10:02.971277] 2022-11-23T02:10:02.9771965Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_overlap.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:10:02.971664] 2022-11-23T02:10:18.3536921Z 2022-11-23T02:10:18.3537450Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_overlap 2022-11-23T02:10:18.3538826Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_overlap (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_overlap_g8zspd0u) 2022-11-23T02:10:18.3539230Z 2022-11-23T02:10:18.3539350Z Running tests... 2022-11-23T02:10:18.3539860Z ---------------------------------------------------------------------- 2022-11-23T02:10:18.3540442Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap 2022-11-23T02:10:18.3540989Z test_forward_overlap (__main__.TestForwardOverlapWorldSizeOne) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:10:18.3541494Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42853 2022-11-23T02:10:18.3542145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:18.3542612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:18.3543214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:18.3543678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:18.3544171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:18.3544841Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:10:18.3545376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:18.3546639Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:10:18.3547441Z warnings.warn( 2022-11-23T02:10:18.3547681Z dist init r=0, world=1 2022-11-23T02:10:18.3547848Z 2022-11-23T02:10:18.3547947Z rank0: 2022-11-23T02:10:18.3548456Z e1: {'cpu_iter': 0.0018582047000002433, 'cpu_wait': 3.619379999992844e-05, 'gpu_compute': 0.06777920052409173, 'gpu_total': 0.7625760018825531} 2022-11-23T02:10:18.3549034Z e2: {'cpu_iter': 0.005753082700000167, 'cpu_wait': 3.461109999998157e-05, 'gpu_compute': 0.266083200648427, 'gpu_total': 2.482361626625061} 2022-11-23T02:10:18.3549621Z e3: {'cpu_iter': 0.0019783122000003317, 'cpu_wait': 0.18314721000000028, 'gpu_compute': 185.5997184753418, 'gpu_total': 185.88284759521486} 2022-11-23T02:10:18.3550336Z e4: {'cpu_iter': 0.005821925400000083, 'cpu_wait': 0.18041971310000005, 'gpu_compute': 185.6111976623535, 'gpu_total': 186.17030334472656} 2022-11-23T02:10:18.3550671Z ok (13.039s) 2022-11-23T02:10:18.3551700Z test_forward_overlap (__main__.TestForwardOverlapWorldSizeTwo) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/71183 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:10:18.3552312Z 2022-11-23T02:10:18.3552586Z ---------------------------------------------------------------------- 2022-11-23T02:10:18.3552920Z Ran 2 tests in 13.040s 2022-11-23T02:10:18.3553088Z 2022-11-23T02:10:18.3553201Z OK (skipped=1) 2022-11-23T02:10:18.3553358Z 2022-11-23T02:10:18.3553472Z Generating XML reports... 
2022-11-23T02:10:18.3554127Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeOne-20221123021004.xml 2022-11-23T02:10:18.3554999Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeTwo-20221123021004.xml 2022-11-23T02:10:18.3555684Z 2022-11-23T02:10:18.3556089Z ##[endgroup] 2022-11-23T02:10:18.3556733Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_overlap (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_overlap_g8zspd0u) 2022-11-23T02:10:18.3557097Z 2022-11-23T02:10:18.3557399Z Running distributed/_tensor/parallel/test_tp_examples ... [2022-11-23 02:10:18.353863] 2022-11-23T02:10:18.3558146Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/parallel/test_tp_examples.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:10:18.354214] 2022-11-23T02:10:33.9956772Z 2022-11-23T02:10:33.9957330Z Expand the folded group to see the log file of distributed/_tensor/parallel/test_tp_examples 2022-11-23T02:10:33.9958337Z ##[group]PRINTING LOG FILE of distributed/_tensor/parallel/test_tp_examples (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_tp_examples_8yddbjt1) 2022-11-23T02:10:33.9958743Z 2022-11-23T02:10:33.9958862Z Running tests... 2022-11-23T02:10:33.9959616Z ---------------------------------------------------------------------- 2022-11-23T02:10:33.9960429Z Test results will be stored in test-reports/python-unittest/distributed._tensor.parallel.test_tp_examples 2022-11-23T02:10:33.9960977Z test_mlp_megatron_e2e (__main__.DistTensorParallelExampleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:10:33.9961492Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42930 2022-11-23T02:10:33.9961953Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42931 2022-11-23T02:10:33.9962591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:33.9963411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:33.9964227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:33.9964716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:33.9965305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:33.9965757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:33.9966322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:33.9966802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:33.9967249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:33.9968019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:33.9968494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:33.9968986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:33.9969663Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:10:33.9970341Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:33.9970739Z ok (5.417s) 2022-11-23T02:10:33.9971209Z test_self_attn_megatron_e2e (__main__.DistTensorParallelExampleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43015 2022-11-23T02:10:33.9971771Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43016 2022-11-23T02:10:33.9972378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:33.9972838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:33.9973419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:33.9973992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:33.9974577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:33.9975025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:33.9975604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:33.9976061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:33.9976509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:33.9977005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:33.9977497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:33.9977971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:33.9978635Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:33.9979333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:33.9979711Z ok (3.913s) 2022-11-23T02:10:33.9980199Z test_self_attn_replacement_megatron_e2e (__main__.DistTensorParallelExampleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43100 2022-11-23T02:10:33.9980786Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43101 2022-11-23T02:10:33.9981405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:33.9981844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:33.9982437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:33.9982912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:33.9983501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:33.9983935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:33.9984513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:33.9984982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:33.9985484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:33.9985985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:33.9986469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:33.9986964Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:33.9987606Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:33.9988347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:33.9988746Z ok (3.914s) 2022-11-23T02:10:33.9988898Z 2022-11-23T02:10:33.9989172Z ---------------------------------------------------------------------- 2022-11-23T02:10:33.9989494Z Ran 3 tests in 13.245s 2022-11-23T02:10:33.9989659Z 2022-11-23T02:10:33.9989754Z OK 2022-11-23T02:10:33.9989891Z 2022-11-23T02:10:33.9990018Z Generating XML reports... 2022-11-23T02:10:33.9990679Z Generated XML report: test-reports/python-unittest/distributed._tensor.parallel.test_tp_examples/TEST-DistTensorParallelExampleTest-20221123021020.xml 2022-11-23T02:10:33.9991095Z 2022-11-23T02:10:33.9991485Z ##[endgroup] 2022-11-23T02:10:33.9992153Z FINISHED PRINTING LOG FILE of distributed/_tensor/parallel/test_tp_examples (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_tp_examples_8yddbjt1) 2022-11-23T02:10:33.9992545Z 2022-11-23T02:10:33.9992852Z Running distributed/checkpoint/test_file_system_checkpoint_cpu ... [2022-11-23 02:10:33.995807] 2022-11-23T02:10:33.9993622Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/checkpoint/test_file_system_checkpoint_cpu.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:10:33.996187] 2022-11-23T02:10:49.9209690Z 2022-11-23T02:10:49.9210520Z Expand the folded group to see the log file of distributed/checkpoint/test_file_system_checkpoint_cpu 2022-11-23T02:10:49.9211884Z ##[group]PRINTING LOG FILE of distributed/checkpoint/test_file_system_checkpoint_cpu (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_file_system_checkpoint_cpu_0r2ic4tp) 2022-11-23T02:10:49.9212355Z 2022-11-23T02:10:49.9212515Z Running tests... 2022-11-23T02:10:49.9213287Z ---------------------------------------------------------------------- 2022-11-23T02:10:49.9213988Z Test results will be stored in test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint_cpu 2022-11-23T02:10:49.9214754Z test_load_rowwise_to_colwise (__main__.TestDistributedReshardOnLoad) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:10:49.9215319Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43220 2022-11-23T02:10:49.9215981Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43221 2022-11-23T02:10:49.9216678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9217351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9217971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9218692Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9219280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9219982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9220573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9221377Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9222098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:49.9222876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:49.9223375Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:49.9224215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:49.9224998Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9225849Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9226350Z ok (3.951s) 2022-11-23T02:10:49.9226966Z test_load_with_different_shard_plan (__main__.TestDistributedReshardOnLoad) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43294 2022-11-23T02:10:49.9227561Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43295 2022-11-23T02:10:49.9228402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9228890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9229803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9230427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9231164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9231683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9232438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9232946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9233605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:49.9234071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:49.9234817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:49.9235870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:49.9236690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9237612Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:10:49.9238021Z ok (2.510s) 2022-11-23T02:10:49.9238651Z test_save_load_bytes (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43368 2022-11-23T02:10:49.9239272Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43369 2022-11-23T02:10:49.9240119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9240596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9241397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9241890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9242657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9243183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9243913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9244688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9245133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:49.9245858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:49.9246355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:49.9247092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:49.9247776Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:10:49.9248733Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9249116Z ok (2.308s) 2022-11-23T02:10:49.9249852Z test_switch_between_sharded_tensor_to_tensor (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43442 2022-11-23T02:10:49.9250438Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43443 2022-11-23T02:10:49.9251316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9251875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9252717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9253195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9254032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9254469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9255324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9255797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9256473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:49.9256967Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:49.9257702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:49.9258258Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:49.9259153Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9259874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9260521Z ok (2.509s) 2022-11-23T02:10:49.9260881Z test_read_write_only_tensor (__main__.TestDistributedStateDictSaveLoad) ... ok (0.020s) 2022-11-23T02:10:49.9261745Z test_read_write_shard_tensor (__main__.TestDistributedStateDictSaveLoadWithSharedTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43516 2022-11-23T02:10:49.9262402Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43517 2022-11-23T02:10:49.9263319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9263765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9264606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9265091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9265932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:10:49.9266469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:10:49.9267326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:10:49.9267812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:10:49.9268425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:10:49.9268956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:10:49.9269497Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:10:49.9270217Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:10:49.9270976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9271825Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:10:49.9272324Z ok (2.308s) 2022-11-23T02:10:49.9272596Z 2022-11-23T02:10:49.9272884Z ---------------------------------------------------------------------- 2022-11-23T02:10:49.9273279Z Ran 6 tests in 13.607s 2022-11-23T02:10:49.9273463Z 2022-11-23T02:10:49.9273624Z OK 2022-11-23T02:10:49.9273879Z 2022-11-23T02:10:49.9274054Z Generating XML reports... 2022-11-23T02:10:49.9274752Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedReshardOnLoad-20221123021035.xml 2022-11-23T02:10:49.9276150Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoad-20221123021035.xml 2022-11-23T02:10:49.9277469Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20221123021035.xml 2022-11-23T02:10:49.9278126Z 2022-11-23T02:10:49.9278536Z ##[endgroup] 2022-11-23T02:10:49.9279234Z FINISHED PRINTING LOG FILE of distributed/checkpoint/test_file_system_checkpoint_cpu (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_file_system_checkpoint_cpu_0r2ic4tp) 2022-11-23T02:10:49.9279653Z 2022-11-23T02:10:49.9279942Z Running distributed/_tensor/test_pointwise_ops ... 
[2022-11-23 02:10:49.921058] 2022-11-23T02:10:49.9280911Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_pointwise_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:10:49.921454] 2022-11-23T02:11:07.1190033Z 2022-11-23T02:11:07.1190922Z Expand the folded group to see the log file of distributed/_tensor/test_pointwise_ops 2022-11-23T02:11:07.1192104Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_pointwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_pointwise_ops_gctzycjm) 2022-11-23T02:11:07.1192510Z 2022-11-23T02:11:07.1192625Z Running tests... 2022-11-23T02:11:07.1193360Z ---------------------------------------------------------------------- 2022-11-23T02:11:07.1193938Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_pointwise_ops 2022-11-23T02:11:07.1194697Z test_activations (__main__.DistElementwiseOpsTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:11:07.1195567Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43625 2022-11-23T02:11:07.1196235Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43626 2022-11-23T02:11:07.1196959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1197570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1198743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1199236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1200073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1200533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1201371Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1201858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1202507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:07.1203003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:07.1203614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:07.1204210Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:07.1205088Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:07.1205929Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:07.1206575Z ok (4.891s) 2022-11-23T02:11:07.1207033Z test_dropout (__main__.DistElementwiseOpsTest) ... skip: testing RNG based ops is broken: https://github.com/pytorch/tau/issues/494 (0.001s) 2022-11-23T02:11:07.1207850Z test_dropout_backward (__main__.DistElementwiseOpsTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43706 2022-11-23T02:11:07.1208392Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43707 2022-11-23T02:11:07.1209238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1209703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1210495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1210978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1211783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1212215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1213021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1213499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1214153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:07.1214652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:07.1215283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:07.1215855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:07.1216730Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:07.1217453Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:11:07.1218075Z ok (3.409s) 2022-11-23T02:11:07.1218519Z test_dropout_errors (__main__.DistElementwiseOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43787 2022-11-23T02:11:07.1219215Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43788 2022-11-23T02:11:07.1219977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1220651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1221249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1221926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1222527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1223172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1223758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1224396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1224896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:07.1225413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:07.1226082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:07.1226652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:07.1227558Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:11:07.1228487Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:07.1228875Z ok (3.309s) 2022-11-23T02:11:07.1229352Z test_mul_out (__main__.DistElementwiseOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43866 2022-11-23T02:11:07.1230065Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43867 2022-11-23T02:11:07.1230864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1231356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1232052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1232639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1233290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:07.1233883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:07.1234494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:07.1235326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:07.1235933Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:07.1236552Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:07.1237076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:07.1237748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:07.1238523Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:07.1239346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:07.1239860Z ok (3.310s) 2022-11-23T02:11:07.1240112Z 2022-11-23T02:11:07.1240378Z ---------------------------------------------------------------------- 2022-11-23T02:11:07.1240716Z Ran 5 tests in 14.920s 2022-11-23T02:11:07.1241086Z 2022-11-23T02:11:07.1241289Z OK (skipped=1) 2022-11-23T02:11:07.1241509Z 2022-11-23T02:11:07.1241620Z Generating XML reports... 2022-11-23T02:11:07.1242273Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_pointwise_ops/TEST-DistElementwiseOpsTest-20221123021051.xml 2022-11-23T02:11:07.1242864Z 2022-11-23T02:11:07.1243200Z ##[endgroup] 2022-11-23T02:11:07.1244041Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_pointwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_pointwise_ops_gctzycjm) 2022-11-23T02:11:07.1244426Z 2022-11-23T02:11:07.1244713Z Running distributed/test_dynamo_distributed ... [2022-11-23 02:11:07.119093] 2022-11-23T02:11:07.1245650Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_dynamo_distributed.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:11:07.119424] 2022-11-23T02:11:24.9595219Z 2022-11-23T02:11:24.9596442Z Expand the folded group to see the log file of distributed/test_dynamo_distributed 2022-11-23T02:11:24.9597422Z ##[group]PRINTING LOG FILE of distributed/test_dynamo_distributed (/var/lib/jenkins/workspace/test/test-reports/distributed-test_dynamo_distributed_kagpuq9v) 2022-11-23T02:11:24.9597797Z 2022-11-23T02:11:24.9597898Z Running tests... 
2022-11-23T02:11:24.9598421Z ---------------------------------------------------------------------- 2022-11-23T02:11:24.9599173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:24.9599906Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:11:24.9600566Z Test results will be stored in test-reports/python-unittest/distributed.test_dynamo_distributed 2022-11-23T02:11:24.9600994Z test_aot_autograd (__main__.TestDistributed) 2022-11-23T02:11:24.9601431Z Explicitly check AotAutograd family of compilers work, ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:11:24.9602074Z [2022-11-23 02:11:11,944] torch._dynamo.eval_frame: [DEBUG] skipping __init__ /opt/conda/lib/python3.10/contextlib.py 2022-11-23T02:11:24.9602682Z [2022-11-23 02:11:11,945] torch._dynamo.eval_frame: [DEBUG] skipping __enter__ /opt/conda/lib/python3.10/contextlib.py 2022-11-23T02:11:24.9603301Z [2022-11-23 02:11:11,945] torch._dynamo.eval_frame: [DEBUG] skipping __init__ /opt/conda/lib/python3.10/contextlib.py 2022-11-23T02:11:24.9603909Z [2022-11-23 02:11:11,945] torch._dynamo.eval_frame: [DEBUG] skipping __enter__ /opt/conda/lib/python3.10/contextlib.py 2022-11-23T02:11:24.9604586Z [2022-11-23 02:11:11,945] torch._dynamo.eval_frame: [DEBUG] skipping enable_dynamic /opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py 2022-11-23T02:11:24.9605213Z [2022-11-23 02:11:11,947] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing opt_fn 2022-11-23T02:11:24.9605902Z [2022-11-23 02:11:11,947] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:434 2022-11-23T02:11:24.9606525Z [2022-11-23 02:11:11,947] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF ddp_m [] 2022-11-23T02:11:24.9607203Z [2022-11-23 02:11:11,947] 
torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9608000Z [2022-11-23 02:11:11,948] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UnspecializedNNModuleVariable(DistributedDataParallel), TensorVariable()] 2022-11-23T02:11:24.9608891Z [2022-11-23 02:11:11,950] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:24.9609406Z 1057 0 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9609721Z 2 LOAD_ATTR 1 (autograd) 2022-11-23T02:11:24.9610117Z 4 LOAD_ATTR 2 (profiler) 2022-11-23T02:11:24.9624638Z 6 LOAD_METHOD 3 (record_function) 2022-11-23T02:11:24.9624858Z 2022-11-23T02:11:24.9625202Z 1058 8 LOAD_CONST 1 ('DistributedDataParallel.forward') 2022-11-23T02:11:24.9625454Z 2022-11-23T02:11:24.9625573Z 1057 10 CALL_METHOD 1 2022-11-23T02:11:24.9625890Z 12 SETUP_WITH 147 (to 308) 2022-11-23T02:11:24.9626175Z 14 POP_TOP 2022-11-23T02:11:24.9626337Z 2022-11-23T02:11:24.9626480Z 1060 16 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9626780Z 18 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:24.9627092Z 20 CALL_METHOD 0 2022-11-23T02:11:24.9627399Z 22 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:24.9627686Z 24 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9628020Z 26 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:24.9628365Z 28 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:24.9628556Z 2022-11-23T02:11:24.9628674Z 1061 30 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9628976Z 32 LOAD_ATTR 6 (logger) 2022-11-23T02:11:24.9629274Z 34 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9629681Z 36 IS_OP 1 2022-11-23T02:11:24.9629981Z 38 POP_JUMP_IF_TRUE 22 (to 44) 2022-11-23T02:11:24.9630280Z 40 LOAD_ASSERTION_ERROR 2022-11-23T02:11:24.9630576Z 42 RAISE_VARARGS 1 2022-11-23T02:11:24.9630756Z 2022-11-23T02:11:24.9630874Z 1062 >> 44 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9631174Z 46 LOAD_ATTR 6 (logger) 2022-11-23T02:11:24.9631510Z 48 LOAD_METHOD 7 (set_runtime_stats_and_log) 
2022-11-23T02:11:24.9632099Z 50 CALL_METHOD 0 2022-11-23T02:11:24.9632389Z 52 POP_TOP 2022-11-23T02:11:24.9632551Z 2022-11-23T02:11:24.9632689Z 1063 54 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9632959Z 56 DUP_TOP 2022-11-23T02:11:24.9633231Z 58 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9633540Z 60 LOAD_CONST 2 (1) 2022-11-23T02:11:24.9633818Z 62 INPLACE_ADD 2022-11-23T02:11:24.9634050Z 64 ROT_TWO 2022-11-23T02:11:24.9634343Z 66 STORE_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9634537Z 2022-11-23T02:11:24.9634670Z 1064 68 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9634950Z 70 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9635716Z 72 LOAD_METHOD 10 (prepare_for_forward) 2022-11-23T02:11:24.9636043Z 74 CALL_METHOD 0 2022-11-23T02:11:24.9636304Z 76 POP_TOP 2022-11-23T02:11:24.9636451Z 2022-11-23T02:11:24.9636591Z 1068 >> 78 LOAD_GLOBAL 11 (Join) 2022-11-23T02:11:24.9636914Z 80 LOAD_METHOD 12 (notify_join_context) 2022-11-23T02:11:24.9637227Z 82 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9637495Z 84 CALL_METHOD 1 2022-11-23T02:11:24.9637785Z 86 STORE_FAST 3 (work) 2022-11-23T02:11:24.9637965Z 2022-11-23T02:11:24.9638104Z 1069 88 LOAD_FAST 3 (work) 2022-11-23T02:11:24.9638394Z 90 POP_JUMP_IF_FALSE 54 (to 108) 2022-11-23T02:11:24.9638590Z 2022-11-23T02:11:24.9638724Z 1070 92 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9639024Z 94 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9639358Z 96 LOAD_METHOD 13 (_set_forward_pass_work_handle) 2022-11-23T02:11:24.9639578Z 2022-11-23T02:11:24.9639697Z 1071 98 LOAD_FAST 3 (work) 2022-11-23T02:11:24.9640121Z 100 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9640445Z 102 LOAD_ATTR 14 (_divide_by_initial_world_size) 2022-11-23T02:11:24.9640666Z 2022-11-23T02:11:24.9640800Z 1070 104 CALL_METHOD 2 2022-11-23T02:11:24.9641050Z 106 POP_TOP 2022-11-23T02:11:24.9641211Z 2022-11-23T02:11:24.9641359Z 1080 >> 108 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9641672Z 110 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:24.9641959Z 112 CALL_METHOD 0 
2022-11-23T02:11:24.9642263Z 114 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:24.9642562Z 116 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9642844Z 118 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9643157Z 120 LOAD_METHOD 15 (_rebuild_buckets) 2022-11-23T02:11:24.9643464Z 122 CALL_METHOD 0 2022-11-23T02:11:24.9643751Z 124 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:24.9643939Z 2022-11-23T02:11:24.9644076Z 1081 126 LOAD_GLOBAL 6 (logger) 2022-11-23T02:11:24.9644376Z 128 LOAD_METHOD 16 (info) 2022-11-23T02:11:24.9644554Z 2022-11-23T02:11:24.9644896Z 1082 130 LOAD_CONST 3 ('Reducer buckets have been rebuilt in this iteration.') 2022-11-23T02:11:24.9645222Z 2022-11-23T02:11:24.9645346Z 1081 132 CALL_METHOD 1 2022-11-23T02:11:24.9645628Z 134 POP_TOP 2022-11-23T02:11:24.9645788Z 2022-11-23T02:11:24.9645924Z 1084 136 LOAD_CONST 4 (True) 2022-11-23T02:11:24.9646223Z 138 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9646515Z 140 STORE_ATTR 17 (_has_rebuilt_buckets) 2022-11-23T02:11:24.9646714Z 2022-11-23T02:11:24.9646863Z 1088 >> 142 LOAD_GLOBAL 18 (hasattr) 2022-11-23T02:11:24.9647156Z 144 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9647530Z 146 LOAD_CONST 5 ('buffer_hook') 2022-11-23T02:11:24.9647847Z 148 CALL_FUNCTION 2 2022-11-23T02:11:24.9648177Z 150 STORE_FAST 4 (buffer_hook_registered) 2022-11-23T02:11:24.9648384Z 2022-11-23T02:11:24.9648500Z 1089 152 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9648829Z 154 LOAD_METHOD 19 (_check_sync_bufs_pre_fwd) 2022-11-23T02:11:24.9649150Z 156 CALL_METHOD 0 2022-11-23T02:11:24.9649450Z 158 POP_JUMP_IF_FALSE 84 (to 168) 2022-11-23T02:11:24.9649622Z 2022-11-23T02:11:24.9649755Z 1090 160 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9650064Z 162 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:24.9650360Z 164 CALL_METHOD 0 2022-11-23T02:11:24.9650609Z 166 POP_TOP 2022-11-23T02:11:24.9650771Z 2022-11-23T02:11:24.9650908Z 1092 >> 168 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9651213Z 170 LOAD_ATTR 21 (_join_config) 2022-11-23T02:11:24.9651515Z 172 LOAD_ATTR 22 
(enable) 2022-11-23T02:11:24.9651805Z 174 POP_JUMP_IF_FALSE 94 (to 188) 2022-11-23T02:11:24.9651986Z 2022-11-23T02:11:24.9652125Z 1094 176 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9652478Z 178 LOAD_ATTR 23 (_check_global_requires_backward_grad_sync) 2022-11-23T02:11:24.9652707Z 2022-11-23T02:11:24.9652827Z 1095 180 LOAD_CONST 6 (False) 2022-11-23T02:11:24.9653015Z 2022-11-23T02:11:24.9653257Z 1094 182 LOAD_CONST 7 (('is_joined_rank',)) 2022-11-23T02:11:24.9653578Z 184 CALL_FUNCTION_KW 1 2022-11-23T02:11:24.9653851Z 186 POP_TOP 2022-11-23T02:11:24.9653993Z 2022-11-23T02:11:24.9654128Z 1098 >> 188 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9654522Z 190 LOAD_ATTR 24 (_run_ddp_forward) 2022-11-23T02:11:24.9654829Z 192 LOAD_FAST 1 (inputs) 2022-11-23T02:11:24.9655109Z 194 BUILD_MAP 0 2022-11-23T02:11:24.9655398Z 196 LOAD_FAST 2 (kwargs) 2022-11-23T02:11:24.9655688Z 198 DICT_MERGE 1 2022-11-23T02:11:24.9655957Z 200 CALL_FUNCTION_EX 1 2022-11-23T02:11:24.9656257Z 202 STORE_FAST 5 (output) 2022-11-23T02:11:24.9656441Z 2022-11-23T02:11:24.9656576Z 1102 204 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9656907Z 206 LOAD_METHOD 25 (_check_sync_bufs_post_fwd) 2022-11-23T02:11:24.9657209Z 208 CALL_METHOD 0 2022-11-23T02:11:24.9657506Z 210 POP_JUMP_IF_FALSE 110 (to 220) 2022-11-23T02:11:24.9657696Z 2022-11-23T02:11:24.9657832Z 1103 212 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9658126Z 214 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:24.9658433Z 216 CALL_METHOD 0 2022-11-23T02:11:24.9658696Z 218 POP_TOP 2022-11-23T02:11:24.9658859Z 2022-11-23T02:11:24.9658981Z 1105 >> 220 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9659293Z 222 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:24.9659667Z 224 CALL_METHOD 0 2022-11-23T02:11:24.9659989Z 226 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:24.9660279Z 228 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9660606Z 230 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:24.9660945Z 232 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:24.9661132Z 
2022-11-23T02:11:24.9661250Z 1106 234 LOAD_CONST 4 (True) 2022-11-23T02:11:24.9661547Z 236 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9661873Z 238 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:24.9662087Z 2022-11-23T02:11:24.9662221Z 1112 240 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9662522Z 242 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:24.9662856Z 244 POP_JUMP_IF_FALSE 137 (to 274) 2022-11-23T02:11:24.9663164Z 246 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9663447Z 248 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9663757Z 250 POP_JUMP_IF_TRUE 137 (to 274) 2022-11-23T02:11:24.9663949Z 2022-11-23T02:11:24.9664083Z 1114 252 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9664355Z 254 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9664685Z 256 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:24.9664890Z 2022-11-23T02:11:24.9665029Z 1115 258 LOAD_GLOBAL 30 (list) 2022-11-23T02:11:24.9665352Z 260 LOAD_GLOBAL 31 (_find_tensors) 2022-11-23T02:11:24.9665644Z 262 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9665940Z 264 CALL_FUNCTION 1 2022-11-23T02:11:24.9666233Z 266 CALL_FUNCTION 1 2022-11-23T02:11:24.9666417Z 2022-11-23T02:11:24.9666533Z 1114 268 CALL_METHOD 1 2022-11-23T02:11:24.9666791Z 270 POP_TOP 2022-11-23T02:11:24.9667074Z 272 JUMP_FORWARD 10 (to 294) 2022-11-23T02:11:24.9667256Z 2022-11-23T02:11:24.9667374Z 1118 >> 274 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9667674Z 276 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9667996Z 278 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:24.9668316Z 280 BUILD_LIST 0 2022-11-23T02:11:24.9668579Z 282 CALL_METHOD 1 2022-11-23T02:11:24.9668935Z 284 POP_TOP 2022-11-23T02:11:24.9669215Z 286 JUMP_FORWARD 3 (to 294) 2022-11-23T02:11:24.9669404Z 2022-11-23T02:11:24.9669523Z 1120 >> 288 LOAD_CONST 6 (False) 2022-11-23T02:11:24.9669824Z 290 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9670154Z 292 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:24.9670448Z >> 294 POP_BLOCK 2022-11-23T02:11:24.9670619Z 
2022-11-23T02:11:24.9670754Z 1057 296 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9671020Z 298 DUP_TOP 2022-11-23T02:11:24.9671269Z 300 DUP_TOP 2022-11-23T02:11:24.9671520Z 302 CALL_FUNCTION 3 2022-11-23T02:11:24.9671782Z 304 POP_TOP 2022-11-23T02:11:24.9672062Z 306 JUMP_FORWARD 8 (to 324) 2022-11-23T02:11:24.9672328Z >> 308 WITH_EXCEPT_START 2022-11-23T02:11:24.9672627Z 310 POP_JUMP_IF_TRUE 157 (to 314) 2022-11-23T02:11:24.9672930Z 312 RERAISE 1 2022-11-23T02:11:24.9673175Z >> 314 POP_TOP 2022-11-23T02:11:24.9673414Z 316 POP_TOP 2022-11-23T02:11:24.9673653Z 318 POP_TOP 2022-11-23T02:11:24.9673885Z 320 POP_EXCEPT 2022-11-23T02:11:24.9674136Z 322 POP_TOP 2022-11-23T02:11:24.9674289Z 2022-11-23T02:11:24.9674487Z 1124 >> 324 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9674811Z 326 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:24.9675391Z 328 POP_JUMP_IF_FALSE 168 (to 336) 2022-11-23T02:11:24.9675702Z 330 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9676013Z 332 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9676308Z 334 POP_JUMP_IF_FALSE 178 (to 356) 2022-11-23T02:11:24.9676496Z 2022-11-23T02:11:24.9676629Z 1125 >> 336 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9676936Z 338 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9677118Z 2022-11-23T02:11:24.9677234Z 1124 340 EXTENDED_ARG 1 2022-11-23T02:11:24.9677537Z 342 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:24.9677725Z 2022-11-23T02:11:24.9677860Z 1125 344 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9678171Z 346 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9678458Z 348 LOAD_CONST 2 (1) 2022-11-23T02:11:24.9678752Z 350 COMPARE_OP 2 (==) 2022-11-23T02:11:24.9678933Z 2022-11-23T02:11:24.9679066Z 1124 352 EXTENDED_ARG 1 2022-11-23T02:11:24.9679349Z 354 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:24.9679533Z 2022-11-23T02:11:24.9679666Z 1128 >> 356 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9679965Z 358 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9680159Z 2022-11-23T02:11:24.9680275Z 1129 360 LOAD_FAST 0 (self) 
2022-11-23T02:11:24.9680584Z 362 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9680776Z 2022-11-23T02:11:24.9681060Z 1127 364 LOAD_CONST 8 (('static_graph', 'num_iterations')) 2022-11-23T02:11:24.9681408Z 366 BUILD_CONST_KEY_MAP 2 2022-11-23T02:11:24.9681701Z 368 STORE_FAST 6 (state_dict) 2022-11-23T02:11:24.9681888Z 2022-11-23T02:11:24.9682055Z 1136 370 LOAD_GLOBAL 32 (_tree_flatten_with_rref) 2022-11-23T02:11:24.9682375Z 372 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9682648Z 374 CALL_FUNCTION 1 2022-11-23T02:11:24.9682825Z 2022-11-23T02:11:24.9682961Z 1132 376 UNPACK_SEQUENCE 3 2022-11-23T02:11:24.9683135Z 2022-11-23T02:11:24.9683293Z 1133 378 STORE_FAST 7 (output_tensor_list) 2022-11-23T02:11:24.9683597Z 2022-11-23T02:11:24.9683749Z 1134 380 STORE_FAST 8 (treespec) 2022-11-23T02:11:24.9683934Z 2022-11-23T02:11:24.9684069Z 1135 382 STORE_FAST 9 (output_is_rref) 2022-11-23T02:11:24.9684260Z 2022-11-23T02:11:24.9684763Z 1137 384 LOAD_CONST 9 ( at 0x7f6052ac4660, file "/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1137>) 2022-11-23T02:11:24.9685426Z 386 LOAD_CONST 10 ('DistributedDataParallel.forward..') 2022-11-23T02:11:24.9685811Z 388 MAKE_FUNCTION 0 2022-11-23T02:11:24.9686091Z 390 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:24.9686385Z 392 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:24.9686698Z 394 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:24.9686989Z 396 CALL_FUNCTION 1 2022-11-23T02:11:24.9687276Z 398 CALL_FUNCTION 1 2022-11-23T02:11:24.9687538Z 400 GET_ITER 2022-11-23T02:11:24.9687790Z 402 CALL_FUNCTION 1 2022-11-23T02:11:24.9688112Z 404 STORE_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9688315Z 2022-11-23T02:11:24.9688461Z 1140 406 LOAD_GLOBAL 35 (enumerate) 2022-11-23T02:11:24.9688880Z 408 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:24.9689193Z 410 CALL_FUNCTION 1 2022-11-23T02:11:24.9689458Z 412 GET_ITER 2022-11-23T02:11:24.9689733Z >> 414 FOR_ITER 18 (to 452) 2022-11-23T02:11:24.9690052Z 416 UNPACK_SEQUENCE 2 
2022-11-23T02:11:24.9690350Z 418 STORE_FAST 11 (i) 2022-11-23T02:11:24.9690647Z 420 STORE_FAST 5 (output) 2022-11-23T02:11:24.9690831Z 2022-11-23T02:11:24.9690952Z 1141 422 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9691264Z 424 LOAD_METHOD 36 (is_tensor) 2022-11-23T02:11:24.9691565Z 426 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9691855Z 428 CALL_METHOD 1 2022-11-23T02:11:24.9692138Z 430 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:24.9692437Z 432 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9692734Z 434 LOAD_ATTR 37 (grad_fn) 2022-11-23T02:11:24.9693011Z 436 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9693293Z 438 IS_OP 0 2022-11-23T02:11:24.9693583Z 440 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:24.9693770Z 2022-11-23T02:11:24.9693887Z 1142 442 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9694210Z 444 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9694525Z 446 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9694784Z 448 STORE_SUBSCR 2022-11-23T02:11:24.9695070Z >> 450 JUMP_ABSOLUTE 207 (to 414) 2022-11-23T02:11:24.9695256Z 2022-11-23T02:11:24.9695403Z 1149 >> 452 LOAD_GLOBAL 38 (_DDPSink) 2022-11-23T02:11:24.9695703Z 454 LOAD_ATTR 39 (apply) 2022-11-23T02:11:24.9695866Z 2022-11-23T02:11:24.9696003Z 1150 456 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9696302Z 458 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9696480Z 2022-11-23T02:11:24.9696623Z 1151 460 LOAD_FAST 6 (state_dict) 2022-11-23T02:11:24.9696811Z 2022-11-23T02:11:24.9696942Z 1149 462 BUILD_LIST 2 2022-11-23T02:11:24.9697096Z 2022-11-23T02:11:24.9697254Z 1152 464 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:24.9697449Z 2022-11-23T02:11:24.9697578Z 1149 466 LIST_EXTEND 1 2022-11-23T02:11:24.9697844Z 468 LIST_TO_TUPLE 2022-11-23T02:11:24.9698183Z 470 CALL_FUNCTION_EX 0 2022-11-23T02:11:24.9698513Z 472 STORE_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:24.9698726Z 2022-11-23T02:11:24.9698867Z 1154 474 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:24.9699146Z 476 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:24.9699469Z 478 
LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9699783Z 480 CALL_FUNCTION 1 2022-11-23T02:11:24.9700066Z 482 CALL_FUNCTION 1 2022-11-23T02:11:24.9700306Z 484 GET_ITER 2022-11-23T02:11:24.9700577Z >> 486 FOR_ITER 15 (to 518) 2022-11-23T02:11:24.9700865Z 488 STORE_FAST 11 (i) 2022-11-23T02:11:24.9701024Z 2022-11-23T02:11:24.9701187Z 1155 490 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9701504Z 492 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9701775Z 494 BINARY_SUBSCR 2022-11-23T02:11:24.9702038Z 496 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9702316Z 498 IS_OP 0 2022-11-23T02:11:24.9702596Z 500 EXTENDED_ARG 1 2022-11-23T02:11:24.9702878Z 502 POP_JUMP_IF_FALSE 258 (to 516) 2022-11-23T02:11:24.9703068Z 2022-11-23T02:11:24.9703300Z 1156 504 LOAD_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:24.9703639Z 506 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9703905Z 508 BINARY_SUBSCR 2022-11-23T02:11:24.9704193Z 510 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9704504Z 512 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9704764Z 514 STORE_SUBSCR 2022-11-23T02:11:24.9705034Z >> 516 JUMP_ABSOLUTE 243 (to 486) 2022-11-23T02:11:24.9705217Z 2022-11-23T02:11:24.9705385Z 1159 >> 518 LOAD_GLOBAL 40 (_tree_unflatten_with_rref) 2022-11-23T02:11:24.9705598Z 2022-11-23T02:11:24.9705761Z 1160 520 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9706081Z 522 LOAD_FAST 8 (treespec) 2022-11-23T02:11:24.9706381Z 524 LOAD_FAST 9 (output_is_rref) 2022-11-23T02:11:24.9706576Z 2022-11-23T02:11:24.9706713Z 1159 526 CALL_FUNCTION 3 2022-11-23T02:11:24.9707007Z 528 STORE_FAST 5 (output) 2022-11-23T02:11:24.9707188Z 2022-11-23T02:11:24.9707306Z 1162 >> 530 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9707579Z 532 RETURN_VALUE 2022-11-23T02:11:24.9707814Z 2022-11-23T02:11:24.9707946Z 2022-11-23T02:11:24.9708413Z [2022-11-23 02:11:11,952] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 
2022-11-23T02:11:24.9709040Z [2022-11-23 02:11:11,952] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [] 2022-11-23T02:11:24.9709777Z [2022-11-23 02:11:11,953] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR autograd [TorchVariable()] 2022-11-23T02:11:24.9710662Z [2022-11-23 02:11:11,953] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR profiler [TorchVariable()] 2022-11-23T02:11:24.9711603Z [2022-11-23 02:11:11,953] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR record_function [TorchVariable()] 2022-11-23T02:11:24.9712433Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1058 2022-11-23T02:11:24.9713267Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST DistributedDataParallel.forward [TorchVariable()] 2022-11-23T02:11:24.9714158Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:24.9714960Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(), ConstantVariable(str)] 2022-11-23T02:11:24.9715861Z [2022-11-23 02:11:11,954] torch._dynamo.variables.torch: [WARNING] Profiler will be ignored 2022-11-23T02:11:24.9716433Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE SETUP_WITH 308 [NullContextVariable()] 2022-11-23T02:11:24.9717107Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_TOP None [WithExitFunctionVariable(), ConstantVariable(NoneType)] 2022-11-23T02:11:24.9717852Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1060 2022-11-23T02:11:24.9718532Z [2022-11-23 02:11:11,954] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch 
[WithExitFunctionVariable()] 2022-11-23T02:11:24.9719450Z [2022-11-23 02:11:11,955] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_grad_enabled [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:24.9720325Z [2022-11-23 02:11:11,955] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:24.9721068Z [2022-11-23 02:11:11,955] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:24.9721731Z [2022-11-23 02:11:11,955] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:24.9722530Z [2022-11-23 02:11:11,955] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR require_backward_grad_sync [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9723340Z [2022-11-23 02:11:11,956] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:24.9724093Z [2022-11-23 02:11:11,956] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1061 2022-11-23T02:11:24.9724768Z [2022-11-23 02:11:11,956] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:24.9725520Z [2022-11-23 02:11:11,956] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9726319Z [2022-11-23 02:11:11,957] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:24.9727117Z [2022-11-23 02:11:11,957] torch._dynamo.symbolic_convert: [DEBUG] TRACE IS_OP 1 [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger), 
ConstantVariable(NoneType)] 2022-11-23T02:11:24.9727886Z [2022-11-23 02:11:11,958] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 44 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:24.9728631Z [2022-11-23 02:11:11,958] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1062 2022-11-23T02:11:24.9729289Z [2022-11-23 02:11:11,958] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:24.9730055Z [2022-11-23 02:11:11,958] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9730988Z [2022-11-23 02:11:11,958] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR set_runtime_stats_and_log [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:24.9731786Z [2022-11-23 02:11:11,959] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), UserDefinedObjectVariable(instancemethod)] 2022-11-23T02:11:24.9732556Z [2022-11-23 02:11:11,960] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 432 2022-11-23T02:11:24.9733015Z 434 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:24.9733317Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:24.9733610Z 4 CALL_FUNCTION 1 2022-11-23T02:11:24.9733864Z 6 RETURN_VALUE 2022-11-23T02:11:24.9734029Z 2022-11-23T02:11:24.9734125Z 2022-11-23T02:11:24.9734705Z [2022-11-23 02:11:11,960] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 432 2022-11-23T02:11:24.9735148Z 432 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:24.9735446Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:24.9735625Z 2022-11-23T02:11:24.9735755Z 434 4 CALL_FUNCTION 1 2022-11-23T02:11:24.9736071Z 6 
RETURN_VALUE 2022-11-23T02:11:24.9736248Z 2022-11-23T02:11:24.9736345Z 2022-11-23T02:11:24.9736721Z [2022-11-23 02:11:12,086] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:24.9737027Z - 2022-11-23T02:11:24.9737300Z local 'ddp_m' TYPE_MATCH 2022-11-23T02:11:24.9737557Z { 2022-11-23T02:11:24.9737879Z 'guard_types': ['TYPE_MATCH'], 2022-11-23T02:11:24.9738269Z 'code': ['___check_type_id(ddp_m, 94883284555664)'], 2022-11-23T02:11:24.9738794Z 'obj_weakref': 2022-11-23T02:11:24.9739411Z 'guarded_class': 2022-11-23T02:11:24.9739756Z } 2022-11-23T02:11:24.9739979Z 2022-11-23T02:11:24.9740215Z - 2022-11-23T02:11:24.9740497Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:24.9740757Z { 2022-11-23T02:11:24.9741087Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:24.9741398Z 'code': None, 2022-11-23T02:11:24.9741826Z 'obj_weakref': 2022-11-23T02:11:24.9742371Z 'guarded_class': 2022-11-23T02:11:24.9742711Z } 2022-11-23T02:11:24.9742913Z 2022-11-23T02:11:24.9743460Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping _call_impl /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py 2022-11-23T02:11:24.9744186Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping forward /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py 2022-11-23T02:11:24.9744875Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping __init__ /opt/conda/lib/python3.10/site-packages/torch/autograd/profiler.py 2022-11-23T02:11:24.9745505Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping inner /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9746100Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping __getitem__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9746691Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping Optional /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9747268Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] 
skipping __repr__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9747865Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping _type_check /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9748548Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping _type_convert /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9749139Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping __init__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9749711Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping __eq__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9750294Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping __hash__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9750873Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping Union /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9751464Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9752063Z [2022-11-23 02:11:12,087] torch._dynamo.eval_frame: [DEBUG] skipping _remove_dups_flatten /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9752683Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping _deduplicate /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9753275Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping __init__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9753906Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping __init__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9754493Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping __setattr__ /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9755250Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping _is_dunder /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9755857Z [2022-11-23 02:11:12,088] 
torch._dynamo.eval_frame: [DEBUG] skipping /opt/conda/lib/python3.10/typing.py 2022-11-23T02:11:24.9756601Z [2022-11-23 02:11:12,088] torch._dynamo.convert_frame: [DEBUG] skipping because no torch.* _collect_type_vars /opt/conda/lib/python3.10/site-packages/typing_extensions.py 123 2022-11-23T02:11:24.9757435Z [2022-11-23 02:11:12,088] torch._dynamo.convert_frame: [DEBUG] skipping because no torch.* _should_collect_from_parameters /opt/conda/lib/python3.10/site-packages/typing_extensions.py 111 2022-11-23T02:11:24.9758187Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping annotate /opt/conda/lib/python3.10/site-packages/torch/jit/__init__.py 2022-11-23T02:11:24.9758876Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping __enter__ /opt/conda/lib/python3.10/site-packages/torch/autograd/profiler.py 2022-11-23T02:11:24.9759540Z [2022-11-23 02:11:12,088] torch._dynamo.eval_frame: [DEBUG] skipping __call__ /opt/conda/lib/python3.10/site-packages/torch/_ops.py 2022-11-23T02:11:24.9760207Z [2022-11-23 02:11:12,089] torch._dynamo.eval_frame: [DEBUG] skipping __setattr__ /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py 2022-11-23T02:11:24.9760914Z [2022-11-23 02:11:12,089] torch._dynamo.eval_frame: [DEBUG] skipping __instancecheck__ /opt/conda/lib/python3.10/site-packages/torch/nn/parameter.py 2022-11-23T02:11:24.9761660Z [2022-11-23 02:11:12,089] torch._dynamo.eval_frame: [DEBUG] skipping notify_join_context /opt/conda/lib/python3.10/site-packages/torch/distributed/algorithms/join.py 2022-11-23T02:11:24.9762380Z [2022-11-23 02:11:12,089] torch._dynamo.eval_frame: [DEBUG] skipping __getattr__ /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py 2022-11-23T02:11:24.9763105Z [2022-11-23 02:11:12,089] torch._dynamo.eval_frame: [DEBUG] skipping _check_sync_bufs_pre_fwd /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py 2022-11-23T02:11:24.9763872Z [2022-11-23 02:11:12,089] 
torch._dynamo.eval_frame: [DEBUG] skipping will_sync_module_buffers /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py 2022-11-23T02:11:24.9764619Z [2022-11-23 02:11:12,089] torch._dynamo.eval_frame: [DEBUG] skipping _run_ddp_forward /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py 2022-11-23T02:11:24.9765473Z [2022-11-23 02:11:12,089] torch._dynamo.eval_frame: [DEBUG] skipping _inside_ddp_forward /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py 2022-11-23T02:11:24.9766120Z [2022-11-23 02:11:12,091] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward 2022-11-23T02:11:24.9766783Z [2022-11-23 02:11:12,091] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:54 2022-11-23T02:11:24.9767378Z [2022-11-23 02:11:12,091] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:24.9767926Z [2022-11-23 02:11:12,091] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR net [NNModuleVariable()] 2022-11-23T02:11:24.9768492Z [2022-11-23 02:11:12,092] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [NNModuleVariable()] 2022-11-23T02:11:24.9769123Z [2022-11-23 02:11:12,092] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:24.9769728Z [2022-11-23 02:11:12,129] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:24.9770385Z [2022-11-23 02:11:12,129] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing forward 2022-11-23T02:11:24.9770956Z [2022-11-23 02:11:12,131] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function compile_fn 2022-11-23T02:11:24.9771633Z [2022-11-23 02:11:12,131] torch._dynamo.optimizations.distributed: [INFO] DDPOptimizer used bucket cap 26214400 and produced the following buckets: 
2022-11-23T02:11:24.9772367Z [2022-11-23 02:11:12,132] torch._dynamo.optimizations.distributed: [INFO] Please `pip install tabulate` in order to pretty-print ddp bucket sizes 2022-11-23T02:11:24.9772965Z [2022-11-23 02:11:12,134] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:24.9773333Z ---orig graph--- 2022-11-23T02:11:24.9773579Z graph(): 2022-11-23T02:11:24.9773890Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:24.9774291Z %self_net_0 : [#users=1] = call_module[target=self_net_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:24.9774726Z %self_net_1 : [#users=1] = call_module[target=self_net_1](args = (%self_net_0,), kwargs = {}) 2022-11-23T02:11:24.9775153Z %self_net_2 : [#users=1] = call_module[target=self_net_2](args = (%self_net_1,), kwargs = {}) 2022-11-23T02:11:24.9775574Z %self_net_3 : [#users=1] = call_module[target=self_net_3](args = (%self_net_2,), kwargs = {}) 2022-11-23T02:11:24.9775979Z %self_net_4 : [#users=1] = call_module[target=self_net_4](args = (%self_net_3,), kwargs = {}) 2022-11-23T02:11:24.9776394Z %self_net_5 : [#users=1] = call_module[target=self_net_5](args = (%self_net_4,), kwargs = {}) 2022-11-23T02:11:24.9776805Z %self_net_6 : [#users=1] = call_module[target=self_net_6](args = (%self_net_5,), kwargs = {}) 2022-11-23T02:11:24.9777219Z %self_net_7 : [#users=1] = call_module[target=self_net_7](args = (%self_net_6,), kwargs = {}) 2022-11-23T02:11:24.9777539Z return (self_net_7,) 2022-11-23T02:11:24.9777706Z 2022-11-23T02:11:24.9777867Z ---split graph--- 2022-11-23T02:11:24.9778115Z graph(): 2022-11-23T02:11:24.9778416Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:24.9778826Z %submod_0 : [#users=1] = call_module[target=submod_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:24.9779246Z %submod_1 : [#users=1] = call_module[target=submod_1](args = (%submod_0,), kwargs = {}) 2022-11-23T02:11:24.9779645Z %submod_2 : [#users=1] = 
call_module[target=submod_2](args = (%submod_1,), kwargs = {}) 2022-11-23T02:11:24.9779968Z return (submod_2,) 2022-11-23T02:11:24.9780133Z 2022-11-23T02:11:24.9780290Z ---submod_0 graph--- 2022-11-23T02:11:24.9780536Z graph(): 2022-11-23T02:11:24.9780828Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:24.9781318Z %self_net_0 : [#users=1] = call_module[target=self_net_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:24.9781752Z %self_net_1 : [#users=1] = call_module[target=self_net_1](args = (%self_net_0,), kwargs = {}) 2022-11-23T02:11:24.9782067Z return self_net_1 2022-11-23T02:11:24.9782229Z 2022-11-23T02:11:24.9782389Z ---submod_1 graph--- 2022-11-23T02:11:24.9782642Z graph(): 2022-11-23T02:11:24.9782923Z %self_net_1 : [#users=1] = placeholder[target=self_net_1] 2022-11-23T02:11:24.9783329Z %self_net_2 : [#users=1] = call_module[target=self_net_2](args = (%self_net_1,), kwargs = {}) 2022-11-23T02:11:24.9783748Z %self_net_3 : [#users=1] = call_module[target=self_net_3](args = (%self_net_2,), kwargs = {}) 2022-11-23T02:11:24.9784075Z return self_net_3 2022-11-23T02:11:24.9784219Z 2022-11-23T02:11:24.9784376Z ---submod_2 graph--- 2022-11-23T02:11:24.9784616Z graph(): 2022-11-23T02:11:24.9784912Z %self_net_3 : [#users=1] = placeholder[target=self_net_3] 2022-11-23T02:11:24.9785296Z %self_net_4 : [#users=1] = call_module[target=self_net_4](args = (%self_net_3,), kwargs = {}) 2022-11-23T02:11:24.9785714Z %self_net_5 : [#users=1] = call_module[target=self_net_5](args = (%self_net_4,), kwargs = {}) 2022-11-23T02:11:24.9786126Z %self_net_6 : [#users=1] = call_module[target=self_net_6](args = (%self_net_5,), kwargs = {}) 2022-11-23T02:11:24.9786582Z %self_net_7 : [#users=1] = call_module[target=self_net_7](args = (%self_net_6,), kwargs = {}) 2022-11-23T02:11:24.9786931Z return self_net_7 2022-11-23T02:11:24.9787095Z 2022-11-23T02:11:24.9787243Z --------------- 2022-11-23T02:11:24.9787402Z 2022-11-23T02:11:24.9787792Z [2022-11-23 
02:11:12,135] torch._dynamo.optimizations.distributed: [DEBUG] run_node placeholder, inputs got args tuple() 2022-11-23T02:11:24.9788485Z [2022-11-23 02:11:12,135] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_0 got args tuple(T[torch.Size([20, 10])]) 2022-11-23T02:11:24.9789070Z [2022-11-23 02:11:12,135] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:24.9789449Z ---submod_0 graph--- 2022-11-23T02:11:24.9789674Z graph(): 2022-11-23T02:11:24.9789985Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:24.9790454Z %self_net_0 : [#users=1] = call_module[target=self_net_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:24.9790889Z %self_net_1 : [#users=1] = call_module[target=self_net_1](args = (%self_net_0,), kwargs = {}) 2022-11-23T02:11:24.9791202Z return self_net_1 2022-11-23T02:11:24.9791779Z [2022-11-23 02:11:12,191] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_1 got args tuple(T[torch.Size([20, 5000])]) 2022-11-23T02:11:24.9792364Z [2022-11-23 02:11:12,191] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:24.9792725Z ---submod_1 graph--- 2022-11-23T02:11:24.9792973Z graph(): 2022-11-23T02:11:24.9793267Z %self_net_1 : [#users=1] = placeholder[target=self_net_1] 2022-11-23T02:11:24.9793656Z %self_net_2 : [#users=1] = call_module[target=self_net_2](args = (%self_net_1,), kwargs = {}) 2022-11-23T02:11:24.9794078Z %self_net_3 : [#users=1] = call_module[target=self_net_3](args = (%self_net_2,), kwargs = {}) 2022-11-23T02:11:24.9794411Z return self_net_3 2022-11-23T02:11:24.9794985Z [2022-11-23 02:11:12,242] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_2 got args tuple(T[torch.Size([20, 5000])]) 2022-11-23T02:11:24.9795806Z [2022-11-23 02:11:12,242] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:24.9796188Z ---submod_2 graph--- 2022-11-23T02:11:24.9796438Z graph(): 
2022-11-23T02:11:24.9796720Z %self_net_3 : [#users=1] = placeholder[target=self_net_3] 2022-11-23T02:11:24.9797117Z %self_net_4 : [#users=1] = call_module[target=self_net_4](args = (%self_net_3,), kwargs = {}) 2022-11-23T02:11:24.9797536Z %self_net_5 : [#users=1] = call_module[target=self_net_5](args = (%self_net_4,), kwargs = {}) 2022-11-23T02:11:24.9798067Z %self_net_6 : [#users=1] = call_module[target=self_net_6](args = (%self_net_5,), kwargs = {}) 2022-11-23T02:11:24.9798468Z %self_net_7 : [#users=1] = call_module[target=self_net_7](args = (%self_net_6,), kwargs = {}) 2022-11-23T02:11:24.9798801Z return self_net_7 2022-11-23T02:11:24.9799379Z [2022-11-23 02:11:12,334] torch._dynamo.optimizations.distributed: [DEBUG] run_node output, output got args tuple(tuple(T[torch.Size([20, 5])])) 2022-11-23T02:11:24.9799951Z [2022-11-23 02:11:12,334] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:24.9800333Z ---final graph--- 2022-11-23T02:11:24.9800582Z graph(): 2022-11-23T02:11:24.9800880Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:24.9801305Z %submod_0 : [#users=1] = call_module[target=compiled_submod_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:24.9801763Z %submod_1 : [#users=1] = call_module[target=compiled_submod_1](args = (%submod_0,), kwargs = {}) 2022-11-23T02:11:24.9802218Z %submod_2 : [#users=1] = call_module[target=compiled_submod_2](args = (%submod_1,), kwargs = {}) 2022-11-23T02:11:24.9802547Z return (submod_2,) 2022-11-23T02:11:24.9802831Z --------------- 2022-11-23T02:11:24.9802989Z 2022-11-23T02:11:24.9803326Z [2022-11-23 02:11:12,335] torch._dynamo.output_graph: [INFO] Step 2: done compiler function compile_fn 2022-11-23T02:11:24.9803907Z [2022-11-23 02:11:12,335] torch._dynamo.output_graph: [CODE] TRACED GRAPH 2022-11-23T02:11:24.9804314Z __compiled_fn_0 .1 opcode, name, target, args, kwargs 2022-11-23T02:11:24.9804662Z placeholder, inputs, inputs, (), {} 2022-11-23T02:11:24.9804992Z 
call_module, self_net_0, self_net_0, (inputs,), {} 2022-11-23T02:11:24.9805314Z call_module, self_net_1, self_net_1, (self_net_0,), {} 2022-11-23T02:11:24.9805652Z call_module, self_net_2, self_net_2, (self_net_1,), {} 2022-11-23T02:11:24.9805983Z call_module, self_net_3, self_net_3, (self_net_2,), {} 2022-11-23T02:11:24.9806305Z call_module, self_net_4, self_net_4, (self_net_3,), {} 2022-11-23T02:11:24.9806634Z call_module, self_net_5, self_net_5, (self_net_4,), {} 2022-11-23T02:11:24.9806960Z call_module, self_net_6, self_net_6, (self_net_5,), {} 2022-11-23T02:11:24.9807270Z call_module, self_net_7, self_net_7, (self_net_6,), {} 2022-11-23T02:11:24.9807601Z output, output, output, ((self_net_7,),), {} 2022-11-23T02:11:24.9807802Z 2022-11-23T02:11:24.9808280Z [2022-11-23 02:11:12,335] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:24.9808746Z 54 0 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9809022Z 2 LOAD_METHOD 0 (net) 2022-11-23T02:11:24.9809312Z 4 LOAD_FAST 1 (inputs) 2022-11-23T02:11:24.9809600Z 6 CALL_METHOD 1 2022-11-23T02:11:24.9809852Z 8 RETURN_VALUE 2022-11-23T02:11:24.9810021Z 2022-11-23T02:11:24.9810116Z 2022-11-23T02:11:24.9810699Z [2022-11-23 02:11:12,335] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:24.9811176Z 53 0 LOAD_GLOBAL 1 (__compiled_fn_0) 2022-11-23T02:11:24.9811471Z 2 LOAD_FAST 1 (inputs) 2022-11-23T02:11:24.9811765Z 4 CALL_FUNCTION 1 2022-11-23T02:11:24.9812060Z 6 UNPACK_SEQUENCE 1 2022-11-23T02:11:24.9812312Z 8 RETURN_VALUE 2022-11-23T02:11:24.9812478Z 2022-11-23T02:11:24.9812573Z 2022-11-23T02:11:24.9812943Z [2022-11-23 02:11:12,336] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:24.9813234Z - 2022-11-23T02:11:24.9813523Z local 'self' NN_MODULE 2022-11-23T02:11:24.9813775Z { 2022-11-23T02:11:24.9814078Z 
'guard_types': ['ID_MATCH'], 2022-11-23T02:11:24.9814472Z 'code': ['___check_obj_id(self, 140051302751088)'], 2022-11-23T02:11:24.9815028Z 'obj_weakref': 2022-11-23T02:11:24.9815558Z 'guarded_class': 2022-11-23T02:11:24.9815861Z } 2022-11-23T02:11:24.9816090Z 2022-11-23T02:11:24.9816326Z - 2022-11-23T02:11:24.9816614Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:24.9816874Z { 2022-11-23T02:11:24.9817203Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:24.9817509Z 'code': None, 2022-11-23T02:11:24.9817942Z 'obj_weakref': 2022-11-23T02:11:24.9818494Z 'guarded_class': 2022-11-23T02:11:24.9818807Z } 2022-11-23T02:11:24.9819034Z 2022-11-23T02:11:24.9819273Z - 2022-11-23T02:11:24.9819576Z local_nn_module 'self.net' NN_MODULE 2022-11-23T02:11:24.9819847Z { 2022-11-23T02:11:24.9820141Z 'guard_types': None, 2022-11-23T02:11:24.9820435Z 'code': None, 2022-11-23T02:11:24.9820745Z 'obj_weakref': None 2022-11-23T02:11:24.9821068Z 'guarded_class': None 2022-11-23T02:11:24.9821370Z } 2022-11-23T02:11:24.9821608Z 2022-11-23T02:11:24.9821845Z - 2022-11-23T02:11:24.9822156Z local_nn_module 'self.net[0]' NN_MODULE 2022-11-23T02:11:24.9822431Z { 2022-11-23T02:11:24.9822727Z 'guard_types': None, 2022-11-23T02:11:24.9823016Z 'code': None, 2022-11-23T02:11:24.9823325Z 'obj_weakref': None 2022-11-23T02:11:24.9823648Z 'guarded_class': None 2022-11-23T02:11:24.9823901Z } 2022-11-23T02:11:24.9824103Z 2022-11-23T02:11:24.9824341Z - 2022-11-23T02:11:24.9824671Z local_nn_module 'self.net[1]' NN_MODULE 2022-11-23T02:11:24.9824925Z { 2022-11-23T02:11:24.9825217Z 'guard_types': None, 2022-11-23T02:11:24.9825521Z 'code': None, 2022-11-23T02:11:24.9825812Z 'obj_weakref': None 2022-11-23T02:11:24.9826131Z 'guarded_class': None 2022-11-23T02:11:24.9826380Z } 2022-11-23T02:11:24.9826588Z 2022-11-23T02:11:24.9826821Z - 2022-11-23T02:11:24.9827145Z local_nn_module 'self.net[2]' NN_MODULE 2022-11-23T02:11:24.9827399Z { 2022-11-23T02:11:24.9827690Z 'guard_types': None, 2022-11-23T02:11:24.9828003Z 
'code': None, 2022-11-23T02:11:24.9828293Z 'obj_weakref': None 2022-11-23T02:11:24.9828613Z 'guarded_class': None 2022-11-23T02:11:24.9828864Z } 2022-11-23T02:11:24.9829066Z 2022-11-23T02:11:24.9829298Z - 2022-11-23T02:11:24.9829628Z local_nn_module 'self.net[3]' NN_MODULE 2022-11-23T02:11:24.9829882Z { 2022-11-23T02:11:24.9830178Z 'guard_types': None, 2022-11-23T02:11:24.9830483Z 'code': None, 2022-11-23T02:11:24.9830775Z 'obj_weakref': None 2022-11-23T02:11:24.9831104Z 'guarded_class': None 2022-11-23T02:11:24.9831356Z } 2022-11-23T02:11:24.9831564Z 2022-11-23T02:11:24.9831805Z - 2022-11-23T02:11:24.9832132Z local_nn_module 'self.net[4]' NN_MODULE 2022-11-23T02:11:24.9832384Z { 2022-11-23T02:11:24.9832681Z 'guard_types': None, 2022-11-23T02:11:24.9832983Z 'code': None, 2022-11-23T02:11:24.9833277Z 'obj_weakref': None 2022-11-23T02:11:24.9833601Z 'guarded_class': None 2022-11-23T02:11:24.9833848Z } 2022-11-23T02:11:24.9834054Z 2022-11-23T02:11:24.9834286Z - 2022-11-23T02:11:24.9834610Z local_nn_module 'self.net[5]' NN_MODULE 2022-11-23T02:11:24.9834941Z { 2022-11-23T02:11:24.9835602Z 'guard_types': None, 2022-11-23T02:11:24.9835913Z 'code': None, 2022-11-23T02:11:24.9836206Z 'obj_weakref': None 2022-11-23T02:11:24.9836530Z 'guarded_class': None 2022-11-23T02:11:24.9836779Z } 2022-11-23T02:11:24.9836998Z 2022-11-23T02:11:24.9837218Z - 2022-11-23T02:11:24.9837549Z local_nn_module 'self.net[6]' NN_MODULE 2022-11-23T02:11:24.9837824Z { 2022-11-23T02:11:24.9838102Z 'guard_types': None, 2022-11-23T02:11:24.9838406Z 'code': None, 2022-11-23T02:11:24.9838713Z 'obj_weakref': None 2022-11-23T02:11:24.9839018Z 'guarded_class': None 2022-11-23T02:11:24.9839266Z } 2022-11-23T02:11:24.9839485Z 2022-11-23T02:11:24.9839699Z - 2022-11-23T02:11:24.9840023Z local_nn_module 'self.net[7]' NN_MODULE 2022-11-23T02:11:24.9840298Z { 2022-11-23T02:11:24.9840576Z 'guard_types': None, 2022-11-23T02:11:24.9840886Z 'code': None, 2022-11-23T02:11:24.9841195Z 'obj_weakref': None 
2022-11-23T02:11:24.9841499Z 'guarded_class': None 2022-11-23T02:11:24.9841754Z } 2022-11-23T02:11:24.9841972Z 2022-11-23T02:11:24.9842596Z [2022-11-23 02:11:12,336] torch._dynamo.eval_frame: [DEBUG] skipping forward /opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py 2022-11-23T02:11:24.9843327Z [2022-11-23 02:11:12,336] torch._dynamo.eval_frame: [DEBUG] skipping __getattr__ /opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py 2022-11-23T02:11:24.9844025Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping __call__ /opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py 2022-11-23T02:11:24.9844731Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping innermost_fn /opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py 2022-11-23T02:11:24.9845366Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping wraps /opt/conda/lib/python3.10/functools.py 2022-11-23T02:11:24.9845983Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping update_wrapper /opt/conda/lib/python3.10/functools.py 2022-11-23T02:11:24.9846648Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping _fn /opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py 2022-11-23T02:11:24.9847329Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping nothing /opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py 2022-11-23T02:11:24.9847947Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping __exit__ /opt/conda/lib/python3.10/contextlib.py 2022-11-23T02:11:24.9848544Z [2022-11-23 02:11:12,337] torch._dynamo.eval_frame: [DEBUG] skipping __exit__ /opt/conda/lib/python3.10/contextlib.py 2022-11-23T02:11:24.9849247Z [2022-11-23 02:11:12,338] torch._dynamo.eval_frame: [DEBUG] skipping _check_sync_bufs_post_fwd /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py 2022-11-23T02:11:24.9849970Z [2022-11-23 
02:11:12,338] torch._dynamo.eval_frame: [DEBUG] skipping __exit__ /opt/conda/lib/python3.10/site-packages/torch/autograd/profiler.py 2022-11-23T02:11:24.9850647Z [2022-11-23 02:11:12,338] torch._dynamo.eval_frame: [DEBUG] skipping is_scripting /opt/conda/lib/python3.10/site-packages/torch/_jit_internal.py 2022-11-23T02:11:24.9851325Z [2022-11-23 02:11:12,338] torch._dynamo.eval_frame: [DEBUG] skipping __getattr__ /opt/conda/lib/python3.10/site-packages/torch/_ops.py 2022-11-23T02:11:24.9851973Z [2022-11-23 02:11:12,338] torch._dynamo.eval_frame: [DEBUG] skipping __init__ /opt/conda/lib/python3.10/site-packages/torch/_ops.py 2022-11-23T02:11:24.9852630Z [2022-11-23 02:11:12,338] torch._dynamo.eval_frame: [DEBUG] skipping __call__ /opt/conda/lib/python3.10/site-packages/torch/_ops.py 2022-11-23T02:11:24.9853110Z ok (2.914s) 2022-11-23T02:11:24.9853406Z test_custom_layer (__main__.TestDistributed) 2022-11-23T02:11:24.9854073Z Just ensures that the appropriate number of splits happen (based on ... 
[2022-11-23 02:11:12,374] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing opt_fn 2022-11-23T02:11:24.9854843Z [2022-11-23 02:11:12,375] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:455 2022-11-23T02:11:24.9855435Z [2022-11-23 02:11:12,375] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF ddp_m [] 2022-11-23T02:11:24.9856107Z [2022-11-23 02:11:12,375] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9856898Z [2022-11-23 02:11:12,375] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UnspecializedNNModuleVariable(DistributedDataParallel), TensorVariable()] 2022-11-23T02:11:24.9857785Z [2022-11-23 02:11:12,377] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:24.9858261Z 1057 0 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9858567Z 2 LOAD_ATTR 1 (autograd) 2022-11-23T02:11:24.9858931Z 4 LOAD_ATTR 2 (profiler) 2022-11-23T02:11:24.9859251Z 6 LOAD_METHOD 3 (record_function) 2022-11-23T02:11:24.9859450Z 2022-11-23T02:11:24.9859747Z 1058 8 LOAD_CONST 1 ('DistributedDataParallel.forward') 2022-11-23T02:11:24.9859996Z 2022-11-23T02:11:24.9860131Z 1057 10 CALL_METHOD 1 2022-11-23T02:11:24.9860425Z 12 SETUP_WITH 147 (to 308) 2022-11-23T02:11:24.9860678Z 14 POP_TOP 2022-11-23T02:11:24.9860839Z 2022-11-23T02:11:24.9860979Z 1060 16 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9861295Z 18 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:24.9861580Z 20 CALL_METHOD 0 2022-11-23T02:11:24.9861876Z 22 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:24.9862175Z 24 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9862508Z 26 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:24.9862823Z 28 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:24.9863008Z 2022-11-23T02:11:24.9863142Z 1061 30 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9863434Z 32 LOAD_ATTR 6 (logger) 
2022-11-23T02:11:24.9863708Z 34 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9863995Z 36 IS_OP 1 2022-11-23T02:11:24.9864287Z 38 POP_JUMP_IF_TRUE 22 (to 44) 2022-11-23T02:11:24.9864562Z 40 LOAD_ASSERTION_ERROR 2022-11-23T02:11:24.9864855Z 42 RAISE_VARARGS 1 2022-11-23T02:11:24.9865030Z 2022-11-23T02:11:24.9865164Z 1062 >> 44 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9865453Z 46 LOAD_ATTR 6 (logger) 2022-11-23T02:11:24.9865762Z 48 LOAD_METHOD 7 (set_runtime_stats_and_log) 2022-11-23T02:11:24.9866077Z 50 CALL_METHOD 0 2022-11-23T02:11:24.9866343Z 52 POP_TOP 2022-11-23T02:11:24.9866486Z 2022-11-23T02:11:24.9866622Z 1063 54 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9866884Z 56 DUP_TOP 2022-11-23T02:11:24.9867175Z 58 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9867463Z 60 LOAD_CONST 2 (1) 2022-11-23T02:11:24.9867733Z 62 INPLACE_ADD 2022-11-23T02:11:24.9867979Z 64 ROT_TWO 2022-11-23T02:11:24.9868242Z 66 STORE_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9868510Z 2022-11-23T02:11:24.9868648Z 1064 68 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9868948Z 70 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9869274Z 72 LOAD_METHOD 10 (prepare_for_forward) 2022-11-23T02:11:24.9869575Z 74 CALL_METHOD 0 2022-11-23T02:11:24.9869838Z 76 POP_TOP 2022-11-23T02:11:24.9869994Z 2022-11-23T02:11:24.9870135Z 1068 >> 78 LOAD_GLOBAL 11 (Join) 2022-11-23T02:11:24.9870435Z 80 LOAD_METHOD 12 (notify_join_context) 2022-11-23T02:11:24.9870748Z 82 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9871028Z 84 CALL_METHOD 1 2022-11-23T02:11:24.9871299Z 86 STORE_FAST 3 (work) 2022-11-23T02:11:24.9871479Z 2022-11-23T02:11:24.9871615Z 1069 88 LOAD_FAST 3 (work) 2022-11-23T02:11:24.9872944Z 90 POP_JUMP_IF_FALSE 54 (to 108) 2022-11-23T02:11:24.9873146Z 2022-11-23T02:11:24.9873284Z 1070 92 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9873568Z 94 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9873909Z 96 LOAD_METHOD 13 (_set_forward_pass_work_handle) 2022-11-23T02:11:24.9874125Z 2022-11-23T02:11:24.9874260Z 1071 98 LOAD_FAST 3 (work) 
2022-11-23T02:11:24.9874596Z 100 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9874942Z 102 LOAD_ATTR 14 (_divide_by_initial_world_size) 2022-11-23T02:11:24.9875325Z 2022-11-23T02:11:24.9875464Z 1070 104 CALL_METHOD 2 2022-11-23T02:11:24.9875735Z 106 POP_TOP 2022-11-23T02:11:24.9875873Z 2022-11-23T02:11:24.9876015Z 1080 >> 108 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9876331Z 110 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:24.9876637Z 112 CALL_METHOD 0 2022-11-23T02:11:24.9876929Z 114 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:24.9877233Z 116 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9877530Z 118 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9877830Z 120 LOAD_METHOD 15 (_rebuild_buckets) 2022-11-23T02:11:24.9878139Z 122 CALL_METHOD 0 2022-11-23T02:11:24.9878450Z 124 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:24.9878639Z 2022-11-23T02:11:24.9878780Z 1081 126 LOAD_GLOBAL 6 (logger) 2022-11-23T02:11:24.9879059Z 128 LOAD_METHOD 16 (info) 2022-11-23T02:11:24.9879239Z 2022-11-23T02:11:24.9879570Z 1082 130 LOAD_CONST 3 ('Reducer buckets have been rebuilt in this iteration.') 2022-11-23T02:11:24.9879819Z 2022-11-23T02:11:24.9879953Z 1081 132 CALL_METHOD 1 2022-11-23T02:11:24.9880206Z 134 POP_TOP 2022-11-23T02:11:24.9880371Z 2022-11-23T02:11:24.9880514Z 1084 136 LOAD_CONST 4 (True) 2022-11-23T02:11:24.9880810Z 138 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9881131Z 140 STORE_ATTR 17 (_has_rebuilt_buckets) 2022-11-23T02:11:24.9881315Z 2022-11-23T02:11:24.9881460Z 1088 >> 142 LOAD_GLOBAL 18 (hasattr) 2022-11-23T02:11:24.9881763Z 144 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9882145Z 146 LOAD_CONST 5 ('buffer_hook') 2022-11-23T02:11:24.9882434Z 148 CALL_FUNCTION 2 2022-11-23T02:11:24.9882764Z 150 STORE_FAST 4 (buffer_hook_registered) 2022-11-23T02:11:24.9882975Z 2022-11-23T02:11:24.9883112Z 1089 152 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9883437Z 154 LOAD_METHOD 19 (_check_sync_bufs_pre_fwd) 2022-11-23T02:11:24.9883740Z 156 CALL_METHOD 0 2022-11-23T02:11:24.9884149Z 158 
POP_JUMP_IF_FALSE 84 (to 168) 2022-11-23T02:11:24.9884339Z 2022-11-23T02:11:24.9884476Z 1090 160 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9884769Z 162 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:24.9885071Z 164 CALL_METHOD 0 2022-11-23T02:11:24.9885335Z 166 POP_TOP 2022-11-23T02:11:24.9885494Z 2022-11-23T02:11:24.9885616Z 1092 >> 168 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9885918Z 170 LOAD_ATTR 21 (_join_config) 2022-11-23T02:11:24.9886223Z 172 LOAD_ATTR 22 (enable) 2022-11-23T02:11:24.9886532Z 174 POP_JUMP_IF_FALSE 94 (to 188) 2022-11-23T02:11:24.9886704Z 2022-11-23T02:11:24.9886841Z 1094 176 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9887191Z 178 LOAD_ATTR 23 (_check_global_requires_backward_grad_sync) 2022-11-23T02:11:24.9887422Z 2022-11-23T02:11:24.9887567Z 1095 180 LOAD_CONST 6 (False) 2022-11-23T02:11:24.9887750Z 2022-11-23T02:11:24.9887975Z 1094 182 LOAD_CONST 7 (('is_joined_rank',)) 2022-11-23T02:11:24.9888293Z 184 CALL_FUNCTION_KW 1 2022-11-23T02:11:24.9888562Z 186 POP_TOP 2022-11-23T02:11:24.9888722Z 2022-11-23T02:11:24.9888857Z 1098 >> 188 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9889228Z 190 LOAD_ATTR 24 (_run_ddp_forward) 2022-11-23T02:11:24.9889557Z 192 LOAD_FAST 1 (inputs) 2022-11-23T02:11:24.9889853Z 194 BUILD_MAP 0 2022-11-23T02:11:24.9890163Z 196 LOAD_FAST 2 (kwargs) 2022-11-23T02:11:24.9890455Z 198 DICT_MERGE 1 2022-11-23T02:11:24.9890746Z 200 CALL_FUNCTION_EX 1 2022-11-23T02:11:24.9891026Z 202 STORE_FAST 5 (output) 2022-11-23T02:11:24.9891216Z 2022-11-23T02:11:24.9891351Z 1102 204 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9891680Z 206 LOAD_METHOD 25 (_check_sync_bufs_post_fwd) 2022-11-23T02:11:24.9892004Z 208 CALL_METHOD 0 2022-11-23T02:11:24.9892290Z 210 POP_JUMP_IF_FALSE 110 (to 220) 2022-11-23T02:11:24.9892480Z 2022-11-23T02:11:24.9892622Z 1103 212 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9892933Z 214 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:24.9893216Z 216 CALL_METHOD 0 2022-11-23T02:11:24.9893483Z 218 POP_TOP 2022-11-23T02:11:24.9893642Z 
2022-11-23T02:11:24.9893783Z 1105 >> 220 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9894081Z 222 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:24.9894387Z 224 CALL_METHOD 0 2022-11-23T02:11:24.9894690Z 226 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:24.9894999Z 228 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9895353Z 230 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:24.9895691Z 232 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:24.9895879Z 2022-11-23T02:11:24.9896017Z 1106 234 LOAD_CONST 4 (True) 2022-11-23T02:11:24.9896299Z 236 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9896631Z 238 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:24.9896847Z 2022-11-23T02:11:24.9896983Z 1112 240 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9897291Z 242 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:24.9897625Z 244 POP_JUMP_IF_FALSE 137 (to 274) 2022-11-23T02:11:24.9897929Z 246 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9898234Z 248 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9898606Z 250 POP_JUMP_IF_TRUE 137 (to 274) 2022-11-23T02:11:24.9898794Z 2022-11-23T02:11:24.9898929Z 1114 252 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9899233Z 254 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9899546Z 256 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:24.9899753Z 2022-11-23T02:11:24.9899896Z 1115 258 LOAD_GLOBAL 30 (list) 2022-11-23T02:11:24.9900205Z 260 LOAD_GLOBAL 31 (_find_tensors) 2022-11-23T02:11:24.9900515Z 262 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9900792Z 264 CALL_FUNCTION 1 2022-11-23T02:11:24.9901085Z 266 CALL_FUNCTION 1 2022-11-23T02:11:24.9901260Z 2022-11-23T02:11:24.9901392Z 1114 268 CALL_METHOD 1 2022-11-23T02:11:24.9901642Z 270 POP_TOP 2022-11-23T02:11:24.9901923Z 272 JUMP_FORWARD 10 (to 294) 2022-11-23T02:11:24.9902112Z 2022-11-23T02:11:24.9902250Z 1118 >> 274 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9902531Z 276 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9902860Z 278 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:24.9903179Z 280 BUILD_LIST 0 
2022-11-23T02:11:24.9903445Z 282 CALL_METHOD 1 2022-11-23T02:11:24.9903766Z 284 POP_TOP 2022-11-23T02:11:24.9904063Z 286 JUMP_FORWARD 3 (to 294) 2022-11-23T02:11:24.9904246Z 2022-11-23T02:11:24.9904386Z 1120 >> 288 LOAD_CONST 6 (False) 2022-11-23T02:11:24.9904663Z 290 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9904995Z 292 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:24.9905302Z >> 294 POP_BLOCK 2022-11-23T02:11:24.9905467Z 2022-11-23T02:11:24.9905585Z 1057 296 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9905858Z 298 DUP_TOP 2022-11-23T02:11:24.9906103Z 300 DUP_TOP 2022-11-23T02:11:24.9906355Z 302 CALL_FUNCTION 3 2022-11-23T02:11:24.9906621Z 304 POP_TOP 2022-11-23T02:11:24.9906900Z 306 JUMP_FORWARD 8 (to 324) 2022-11-23T02:11:24.9907167Z >> 308 WITH_EXCEPT_START 2022-11-23T02:11:24.9907467Z 310 POP_JUMP_IF_TRUE 157 (to 314) 2022-11-23T02:11:24.9907767Z 312 RERAISE 1 2022-11-23T02:11:24.9908025Z >> 314 POP_TOP 2022-11-23T02:11:24.9908251Z 316 POP_TOP 2022-11-23T02:11:24.9908494Z 318 POP_TOP 2022-11-23T02:11:24.9908745Z 320 POP_EXCEPT 2022-11-23T02:11:24.9908978Z 322 POP_TOP 2022-11-23T02:11:24.9909134Z 2022-11-23T02:11:24.9909270Z 1124 >> 324 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9909597Z 326 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:24.9909918Z 328 POP_JUMP_IF_FALSE 168 (to 336) 2022-11-23T02:11:24.9910220Z 330 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9910524Z 332 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9910822Z 334 POP_JUMP_IF_FALSE 178 (to 356) 2022-11-23T02:11:24.9911010Z 2022-11-23T02:11:24.9911145Z 1125 >> 336 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9911454Z 338 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9911643Z 2022-11-23T02:11:24.9911778Z 1124 340 EXTENDED_ARG 1 2022-11-23T02:11:24.9912061Z 342 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:24.9912249Z 2022-11-23T02:11:24.9912385Z 1125 344 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9912697Z 346 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9912984Z 348 LOAD_CONST 2 (1) 
2022-11-23T02:11:24.9913278Z 350 COMPARE_OP 2 (==) 2022-11-23T02:11:24.9913531Z 2022-11-23T02:11:24.9913667Z 1124 352 EXTENDED_ARG 1 2022-11-23T02:11:24.9913971Z 354 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:24.9914139Z 2022-11-23T02:11:24.9914276Z 1128 >> 356 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9914581Z 358 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:24.9914774Z 2022-11-23T02:11:24.9914912Z 1129 360 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9915423Z 362 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:24.9915615Z 2022-11-23T02:11:24.9915898Z 1127 364 LOAD_CONST 8 (('static_graph', 'num_iterations')) 2022-11-23T02:11:24.9916243Z 366 BUILD_CONST_KEY_MAP 2 2022-11-23T02:11:24.9916558Z 368 STORE_FAST 6 (state_dict) 2022-11-23T02:11:24.9916726Z 2022-11-23T02:11:24.9916897Z 1136 370 LOAD_GLOBAL 32 (_tree_flatten_with_rref) 2022-11-23T02:11:24.9917231Z 372 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9917526Z 374 CALL_FUNCTION 1 2022-11-23T02:11:24.9917705Z 2022-11-23T02:11:24.9917824Z 1132 376 UNPACK_SEQUENCE 3 2022-11-23T02:11:24.9918004Z 2022-11-23T02:11:24.9918164Z 1133 378 STORE_FAST 7 (output_tensor_list) 2022-11-23T02:11:24.9918467Z 2022-11-23T02:11:24.9918625Z 1134 380 STORE_FAST 8 (treespec) 2022-11-23T02:11:24.9918813Z 2022-11-23T02:11:24.9918967Z 1135 382 STORE_FAST 9 (output_is_rref) 2022-11-23T02:11:24.9919139Z 2022-11-23T02:11:24.9919641Z 1137 384 LOAD_CONST 9 ( at 0x7f6052ac4660, file "/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1137>) 2022-11-23T02:11:24.9920296Z 386 LOAD_CONST 10 ('DistributedDataParallel.forward..') 2022-11-23T02:11:24.9920689Z 388 MAKE_FUNCTION 0 2022-11-23T02:11:24.9920986Z 390 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:24.9921272Z 392 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:24.9921593Z 394 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:24.9921908Z 396 CALL_FUNCTION 1 2022-11-23T02:11:24.9922186Z 398 CALL_FUNCTION 1 2022-11-23T02:11:24.9922454Z 400 GET_ITER 2022-11-23T02:11:24.9922722Z 402 
CALL_FUNCTION 1 2022-11-23T02:11:24.9923025Z 404 STORE_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9923232Z 2022-11-23T02:11:24.9923378Z 1140 406 LOAD_GLOBAL 35 (enumerate) 2022-11-23T02:11:24.9923703Z 408 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:24.9924002Z 410 CALL_FUNCTION 1 2022-11-23T02:11:24.9924269Z 412 GET_ITER 2022-11-23T02:11:24.9924553Z >> 414 FOR_ITER 18 (to 452) 2022-11-23T02:11:24.9924850Z 416 UNPACK_SEQUENCE 2 2022-11-23T02:11:24.9925127Z 418 STORE_FAST 11 (i) 2022-11-23T02:11:24.9925427Z 420 STORE_FAST 5 (output) 2022-11-23T02:11:24.9925611Z 2022-11-23T02:11:24.9925752Z 1141 422 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9926045Z 424 LOAD_METHOD 36 (is_tensor) 2022-11-23T02:11:24.9926351Z 426 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9926643Z 428 CALL_METHOD 1 2022-11-23T02:11:24.9926926Z 430 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:24.9927229Z 432 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9927524Z 434 LOAD_ATTR 37 (grad_fn) 2022-11-23T02:11:24.9927819Z 436 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9928091Z 438 IS_OP 0 2022-11-23T02:11:24.9928480Z 440 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:24.9928671Z 2022-11-23T02:11:24.9928809Z 1142 442 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9929111Z 444 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9929434Z 446 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9929716Z 448 STORE_SUBSCR 2022-11-23T02:11:24.9929990Z >> 450 JUMP_ABSOLUTE 207 (to 414) 2022-11-23T02:11:24.9930174Z 2022-11-23T02:11:24.9930320Z 1149 >> 452 LOAD_GLOBAL 38 (_DDPSink) 2022-11-23T02:11:24.9930623Z 454 LOAD_ATTR 39 (apply) 2022-11-23T02:11:24.9930805Z 2022-11-23T02:11:24.9930941Z 1150 456 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9931222Z 458 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:24.9931403Z 2022-11-23T02:11:24.9931550Z 1151 460 LOAD_FAST 6 (state_dict) 2022-11-23T02:11:24.9931738Z 2022-11-23T02:11:24.9931871Z 1149 462 BUILD_LIST 2 2022-11-23T02:11:24.9932045Z 2022-11-23T02:11:24.9932183Z 1152 464 LOAD_FAST 7 
(output_tensor_list) 2022-11-23T02:11:24.9932385Z 2022-11-23T02:11:24.9932517Z 1149 466 LIST_EXTEND 1 2022-11-23T02:11:24.9932789Z 468 LIST_TO_TUPLE 2022-11-23T02:11:24.9933131Z 470 CALL_FUNCTION_EX 0 2022-11-23T02:11:24.9933459Z 472 STORE_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:24.9933674Z 2022-11-23T02:11:24.9933816Z 1154 474 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:24.9934117Z 476 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:24.9934423Z 478 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9934741Z 480 CALL_FUNCTION 1 2022-11-23T02:11:24.9935030Z 482 CALL_FUNCTION 1 2022-11-23T02:11:24.9935282Z 484 GET_ITER 2022-11-23T02:11:24.9935559Z >> 486 FOR_ITER 15 (to 518) 2022-11-23T02:11:24.9935852Z 488 STORE_FAST 11 (i) 2022-11-23T02:11:24.9936031Z 2022-11-23T02:11:24.9936196Z 1155 490 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9936494Z 492 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9936772Z 494 BINARY_SUBSCR 2022-11-23T02:11:24.9937054Z 496 LOAD_CONST 0 (None) 2022-11-23T02:11:24.9937324Z 498 IS_OP 0 2022-11-23T02:11:24.9937608Z 500 EXTENDED_ARG 1 2022-11-23T02:11:24.9937909Z 502 POP_JUMP_IF_FALSE 258 (to 516) 2022-11-23T02:11:24.9938099Z 2022-11-23T02:11:24.9938253Z 1156 504 LOAD_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:24.9938579Z 506 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9938853Z 508 BINARY_SUBSCR 2022-11-23T02:11:24.9939149Z 510 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9939465Z 512 LOAD_FAST 11 (i) 2022-11-23T02:11:24.9939737Z 514 STORE_SUBSCR 2022-11-23T02:11:24.9940030Z >> 516 JUMP_ABSOLUTE 243 (to 486) 2022-11-23T02:11:24.9940198Z 2022-11-23T02:11:24.9940375Z 1159 >> 518 LOAD_GLOBAL 40 (_tree_unflatten_with_rref) 2022-11-23T02:11:24.9940586Z 2022-11-23T02:11:24.9940748Z 1160 520 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:24.9941075Z 522 LOAD_FAST 8 (treespec) 2022-11-23T02:11:24.9941375Z 524 LOAD_FAST 9 (output_is_rref) 2022-11-23T02:11:24.9941569Z 2022-11-23T02:11:24.9941702Z 1159 526 CALL_FUNCTION 3 
2022-11-23T02:11:24.9942001Z 528 STORE_FAST 5 (output) 2022-11-23T02:11:24.9942184Z 2022-11-23T02:11:24.9942322Z 1162 >> 530 LOAD_FAST 5 (output) 2022-11-23T02:11:24.9942655Z 532 RETURN_VALUE 2022-11-23T02:11:24.9942900Z 2022-11-23T02:11:24.9943035Z 2022-11-23T02:11:24.9943513Z [2022-11-23 02:11:12,378] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:24.9944125Z [2022-11-23 02:11:12,379] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [] 2022-11-23T02:11:24.9944856Z [2022-11-23 02:11:12,379] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR autograd [TorchVariable()] 2022-11-23T02:11:24.9945738Z [2022-11-23 02:11:12,379] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR profiler [TorchVariable()] 2022-11-23T02:11:24.9946678Z [2022-11-23 02:11:12,379] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR record_function [TorchVariable()] 2022-11-23T02:11:24.9947531Z [2022-11-23 02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1058 2022-11-23T02:11:24.9948403Z [2022-11-23 02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST DistributedDataParallel.forward [TorchVariable()] 2022-11-23T02:11:24.9949240Z [2022-11-23 02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:24.9950048Z [2022-11-23 02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(), ConstantVariable(str)] 2022-11-23T02:11:24.9950705Z [2022-11-23 02:11:12,380] torch._dynamo.variables.torch: [WARNING] Profiler will be ignored 2022-11-23T02:11:24.9951282Z [2022-11-23 02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE SETUP_WITH 308 [NullContextVariable()] 2022-11-23T02:11:24.9951959Z [2022-11-23 
02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_TOP None [WithExitFunctionVariable(), ConstantVariable(NoneType)] 2022-11-23T02:11:24.9952716Z [2022-11-23 02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1060 2022-11-23T02:11:24.9953394Z [2022-11-23 02:11:12,380] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [WithExitFunctionVariable()] 2022-11-23T02:11:24.9954220Z [2022-11-23 02:11:12,381] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_grad_enabled [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:24.9955239Z [2022-11-23 02:11:12,381] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:24.9956011Z [2022-11-23 02:11:12,381] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:24.9956683Z [2022-11-23 02:11:12,381] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:24.9957500Z [2022-11-23 02:11:12,381] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR require_backward_grad_sync [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9958302Z [2022-11-23 02:11:12,382] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:24.9959056Z [2022-11-23 02:11:12,382] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1061 2022-11-23T02:11:24.9959831Z [2022-11-23 02:11:12,382] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:24.9960601Z [2022-11-23 02:11:12,382] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger 
[WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9961392Z [2022-11-23 02:11:12,383] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:24.9962197Z [2022-11-23 02:11:12,383] torch._dynamo.symbolic_convert: [DEBUG] TRACE IS_OP 1 [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger), ConstantVariable(NoneType)] 2022-11-23T02:11:24.9962961Z [2022-11-23 02:11:12,383] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 44 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:24.9963711Z [2022-11-23 02:11:12,383] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1062 2022-11-23T02:11:24.9964369Z [2022-11-23 02:11:12,383] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:24.9965142Z [2022-11-23 02:11:12,383] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:24.9966058Z [2022-11-23 02:11:12,384] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR set_runtime_stats_and_log [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:24.9966874Z [2022-11-23 02:11:12,384] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), UserDefinedObjectVariable(instancemethod)] 2022-11-23T02:11:24.9967660Z [2022-11-23 02:11:12,385] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 453 2022-11-23T02:11:24.9968111Z 455 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:24.9968414Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:24.9968711Z 4 CALL_FUNCTION 1 2022-11-23T02:11:24.9968970Z 6 RETURN_VALUE 2022-11-23T02:11:24.9969137Z 
2022-11-23T02:11:24.9969235Z 2022-11-23T02:11:24.9969823Z [2022-11-23 02:11:12,385] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 453 2022-11-23T02:11:24.9970286Z 453 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:24.9970569Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:24.9970750Z 2022-11-23T02:11:24.9970883Z 455 4 CALL_FUNCTION 1 2022-11-23T02:11:24.9971159Z 6 RETURN_VALUE 2022-11-23T02:11:24.9971323Z 2022-11-23T02:11:24.9971402Z 2022-11-23T02:11:24.9971777Z [2022-11-23 02:11:12,386] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:24.9972096Z - 2022-11-23T02:11:24.9972376Z local 'ddp_m' TYPE_MATCH 2022-11-23T02:11:24.9972641Z { 2022-11-23T02:11:24.9972968Z 'guard_types': ['TYPE_MATCH'], 2022-11-23T02:11:24.9973361Z 'code': ['___check_type_id(ddp_m, 94883284555664)'], 2022-11-23T02:11:24.9973904Z 'obj_weakref': 2022-11-23T02:11:24.9974521Z 'guarded_class': 2022-11-23T02:11:24.9974889Z } 2022-11-23T02:11:24.9975098Z 2022-11-23T02:11:24.9975339Z - 2022-11-23T02:11:24.9975642Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:24.9975890Z { 2022-11-23T02:11:24.9976223Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:24.9976549Z 'code': None, 2022-11-23T02:11:24.9976969Z 'obj_weakref': 2022-11-23T02:11:24.9977603Z 'guarded_class': 2022-11-23T02:11:24.9977944Z } 2022-11-23T02:11:24.9978154Z 2022-11-23T02:11:24.9978626Z [2022-11-23 02:11:12,388] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward 2022-11-23T02:11:24.9979320Z [2022-11-23 02:11:12,388] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:92 2022-11-23T02:11:24.9979928Z [2022-11-23 02:11:12,388] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:24.9980468Z [2022-11-23 02:11:12,388] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq [NNModuleVariable()] 
2022-11-23T02:11:24.9981044Z [2022-11-23 02:11:12,388] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [NNModuleVariable()] 2022-11-23T02:11:24.9981674Z [2022-11-23 02:11:12,388] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:24.9982469Z [2022-11-23 02:11:12,389] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:24.9983005Z 78 0 LOAD_FAST 0 (self) 2022-11-23T02:11:24.9983318Z 2 LOAD_METHOD 0 (linear) 2022-11-23T02:11:24.9983616Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:24.9983884Z 6 CALL_METHOD 1 2022-11-23T02:11:24.9984156Z 8 RETURN_VALUE 2022-11-23T02:11:24.9984398Z 2022-11-23T02:11:24.9984534Z 2022-11-23T02:11:24.9984978Z [2022-11-23 02:11:12,389] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:78 2022-11-23T02:11:24.9985564Z [2022-11-23 02:11:12,389] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:24.9986137Z [2022-11-23 02:11:12,389] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR linear [NNModuleVariable()] 2022-11-23T02:11:24.9986720Z [2022-11-23 02:11:12,390] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [NNModuleVariable()] 2022-11-23T02:11:24.9987322Z [2022-11-23 02:11:12,390] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:24.9987930Z [2022-11-23 02:11:12,396] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:24.9988714Z [2022-11-23 02:11:12,397] torch._dynamo.symbolic_convert: [DEBUG] DONE INLINING 2022-11-23T02:11:24.9989583Z [2022-11-23 02:11:12,399] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:24.9990077Z 70 0 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:24.9990403Z 2 LOAD_METHOD 1 (mm) 2022-11-23T02:11:24.9990699Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:24.9990992Z 6 LOAD_FAST 0 (self) 
2022-11-23T02:11:24.9991277Z 8 LOAD_ATTR 2 (weight) 2022-11-23T02:11:24.9991576Z 10 LOAD_METHOD 3 (t) 2022-11-23T02:11:24.9991865Z 12 CALL_METHOD 0 2022-11-23T02:11:24.9992134Z 14 CALL_METHOD 2 2022-11-23T02:11:24.9992406Z 16 RETURN_VALUE 2022-11-23T02:11:24.9992650Z 2022-11-23T02:11:24.9992787Z 2022-11-23T02:11:24.9993219Z [2022-11-23 02:11:12,399] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:70 2022-11-23T02:11:24.9993912Z [2022-11-23 02:11:12,399] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [] 2022-11-23T02:11:24.9994626Z [2022-11-23 02:11:12,399] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR mm [TorchVariable()] 2022-11-23T02:11:24.9995562Z [2022-11-23 02:11:12,400] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [TorchVariable()] 2022-11-23T02:11:24.9996314Z [2022-11-23 02:11:12,400] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TorchVariable(), TensorVariable()] 2022-11-23T02:11:24.9997144Z [2022-11-23 02:11:12,400] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR weight [TorchVariable(), TensorVariable(), NNModuleVariable()] 2022-11-23T02:11:24.9997982Z [2022-11-23 02:11:12,402] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR t [TorchVariable(), TensorVariable(), TensorVariable()] 2022-11-23T02:11:24.9998869Z [2022-11-23 02:11:12,402] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [TorchVariable(), TensorVariable(), GetAttrVariable(TensorVariable(), t)] 2022-11-23T02:11:24.9999828Z [2022-11-23 02:11:12,403] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(), TensorVariable(), TensorVariable()] 2022-11-23T02:11:25.0000512Z [2022-11-23 02:11:12,404] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0001296Z [2022-11-23 02:11:12,404] torch._dynamo.symbolic_convert: [DEBUG] DONE INLINING 2022-11-23T02:11:25.0002154Z 
[2022-11-23 02:11:12,407] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:25.0002649Z 78 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0002932Z 2 LOAD_METHOD 0 (linear) 2022-11-23T02:11:25.0003232Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0003521Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0003795Z 8 RETURN_VALUE 2022-11-23T02:11:25.0004019Z 2022-11-23T02:11:25.0004157Z 2022-11-23T02:11:25.0004600Z [2022-11-23 02:11:12,407] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:78 2022-11-23T02:11:25.0005201Z [2022-11-23 02:11:12,407] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:25.0005746Z [2022-11-23 02:11:12,407] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR linear [NNModuleVariable()] 2022-11-23T02:11:25.0006336Z [2022-11-23 02:11:12,408] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [NNModuleVariable()] 2022-11-23T02:11:25.0006953Z [2022-11-23 02:11:12,408] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:25.0007564Z [2022-11-23 02:11:12,414] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0008326Z [2022-11-23 02:11:12,414] torch._dynamo.symbolic_convert: [DEBUG] DONE INLINING 2022-11-23T02:11:25.0009011Z [2022-11-23 02:11:12,417] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0009586Z [2022-11-23 02:11:12,417] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing forward 2022-11-23T02:11:25.0010158Z [2022-11-23 02:11:12,418] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function compile_fn 2022-11-23T02:11:25.0010910Z [2022-11-23 02:11:12,418] torch._dynamo.optimizations.distributed: [INFO] DDPOptimizer used bucket cap 1048576 and produced the following buckets: 2022-11-23T02:11:25.0011659Z [2022-11-23 02:11:12,419] 
torch._dynamo.optimizations.distributed: [INFO] Please `pip install tabulate` in order to pretty-print ddp bucket sizes 2022-11-23T02:11:25.0012263Z [2022-11-23 02:11:12,422] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0012639Z ---orig graph--- 2022-11-23T02:11:25.0012863Z graph(): 2022-11-23T02:11:25.0013159Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0013573Z %self_seq_0_linear : [#users=1] = call_module[target=self_seq_0_linear](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0014011Z %self_seq_1 : [#users=1] = call_module[target=self_seq_1](args = (%self_seq_0_linear,), kwargs = {}) 2022-11-23T02:11:25.0014431Z %self_seq_2_weight : [#users=1] = get_attr[target=self_seq_2_weight] 2022-11-23T02:11:25.0014841Z %t : [#users=1] = call_method[target=t](args = (%self_seq_2_weight,), kwargs = {}) 2022-11-23T02:11:25.0015265Z %mm : [#users=1] = call_function[target=torch.mm](args = (%self_seq_1, %t), kwargs = {}) 2022-11-23T02:11:25.0015672Z %self_seq_3 : [#users=1] = call_module[target=self_seq_3](args = (%mm,), kwargs = {}) 2022-11-23T02:11:25.0015967Z %self_seq_4_linear : [#users=1] = call_module[target=self_seq_4_linear](args = (%self_seq_3,), kwargs = {}) 2022-11-23T02:11:25.0016203Z %self_seq_5 : [#users=1] = call_module[target=self_seq_5](args = (%self_seq_4_linear,), kwargs = {}) 2022-11-23T02:11:25.0016324Z return (self_seq_5,) 2022-11-23T02:11:25.0016345Z 2022-11-23T02:11:25.0016501Z ---split graph--- 2022-11-23T02:11:25.0016607Z graph(): 2022-11-23T02:11:25.0016763Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0016950Z %self_seq_2_weight : [#users=1] = get_attr[target=self_seq_2_weight] 2022-11-23T02:11:25.0017153Z %submod_0 : [#users=1] = call_module[target=submod_0](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0017386Z %submod_1 : [#users=1] = call_module[target=submod_1](args = (%self_seq_2_weight, %submod_0), kwargs = {}) 2022-11-23T02:11:25.0017590Z %submod_2 : [#users=1] 
= call_module[target=submod_2](args = (%submod_1,), kwargs = {}) 2022-11-23T02:11:25.0017711Z return (submod_2,) 2022-11-23T02:11:25.0017731Z 2022-11-23T02:11:25.0017894Z ---submod_0 graph--- 2022-11-23T02:11:25.0018000Z graph(): 2022-11-23T02:11:25.0018154Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0018377Z %self_seq_0_linear : [#users=1] = call_module[target=self_seq_0_linear](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0018599Z %self_seq_1 : [#users=1] = call_module[target=self_seq_1](args = (%self_seq_0_linear,), kwargs = {}) 2022-11-23T02:11:25.0018715Z return self_seq_1 2022-11-23T02:11:25.0018736Z 2022-11-23T02:11:25.0018900Z ---submod_1 graph--- 2022-11-23T02:11:25.0019005Z graph(): 2022-11-23T02:11:25.0019201Z %self_seq_2_weight : [#users=1] = placeholder[target=self_seq_2_weight] 2022-11-23T02:11:25.0019378Z %self_seq_1 : [#users=1] = placeholder[target=self_seq_1] 2022-11-23T02:11:25.0019561Z %t : [#users=1] = call_method[target=t](args = (%self_seq_2_weight,), kwargs = {}) 2022-11-23T02:11:25.0019774Z %mm : [#users=1] = call_function[target=torch.mm](args = (%self_seq_1, %t), kwargs = {}) 2022-11-23T02:11:25.0019975Z %self_seq_3 : [#users=1] = call_module[target=self_seq_3](args = (%mm,), kwargs = {}) 2022-11-23T02:11:25.0020091Z return self_seq_3 2022-11-23T02:11:25.0020111Z 2022-11-23T02:11:25.0020269Z ---submod_2 graph--- 2022-11-23T02:11:25.0020372Z graph(): 2022-11-23T02:11:25.0020549Z %self_seq_3 : [#users=1] = placeholder[target=self_seq_3] 2022-11-23T02:11:25.0020786Z %self_seq_4_linear : [#users=1] = call_module[target=self_seq_4_linear](args = (%self_seq_3,), kwargs = {}) 2022-11-23T02:11:25.0021070Z %self_seq_5 : [#users=1] = call_module[target=self_seq_5](args = (%self_seq_4_linear,), kwargs = {}) 2022-11-23T02:11:25.0021185Z return self_seq_5 2022-11-23T02:11:25.0021205Z 2022-11-23T02:11:25.0021349Z --------------- 2022-11-23T02:11:25.0021368Z 2022-11-23T02:11:25.0021748Z [2022-11-23 02:11:12,422] 
torch._dynamo.optimizations.distributed: [DEBUG] run_node placeholder, x got args tuple() 2022-11-23T02:11:25.0022148Z [2022-11-23 02:11:12,422] torch._dynamo.optimizations.distributed: [DEBUG] run_node get_attr, self_seq_2_weight got args tuple() 2022-11-23T02:11:25.0022577Z [2022-11-23 02:11:12,422] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_0 got args tuple(T[torch.Size([512, 512])]) 2022-11-23T02:11:25.0022868Z [2022-11-23 02:11:12,422] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0023022Z ---submod_0 graph--- 2022-11-23T02:11:25.0023106Z graph(): 2022-11-23T02:11:25.0023281Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0023507Z %self_seq_0_linear : [#users=1] = call_module[target=self_seq_0_linear](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0023729Z %self_seq_1 : [#users=1] = call_module[target=self_seq_1](args = (%self_seq_0_linear,), kwargs = {}) 2022-11-23T02:11:25.0023844Z return self_seq_1 2022-11-23T02:11:25.0024394Z [2022-11-23 02:11:12,423] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_1 got args tuple(T[torch.Size([512, 512])], T[torch.Size([512, 512])]) 2022-11-23T02:11:25.0024699Z [2022-11-23 02:11:12,423] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0024855Z ---submod_1 graph--- 2022-11-23T02:11:25.0024938Z graph(): 2022-11-23T02:11:25.0025136Z %self_seq_2_weight : [#users=1] = placeholder[target=self_seq_2_weight] 2022-11-23T02:11:25.0025315Z %self_seq_1 : [#users=1] = placeholder[target=self_seq_1] 2022-11-23T02:11:25.0025518Z %t : [#users=1] = call_method[target=t](args = (%self_seq_2_weight,), kwargs = {}) 2022-11-23T02:11:25.0025735Z %mm : [#users=1] = call_function[target=torch.mm](args = (%self_seq_1, %t), kwargs = {}) 2022-11-23T02:11:25.0025936Z %self_seq_3 : [#users=1] = call_module[target=self_seq_3](args = (%mm,), kwargs = {}) 2022-11-23T02:11:25.0026054Z return self_seq_3 
2022-11-23T02:11:25.0026498Z [2022-11-23 02:11:12,424] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_2 got args tuple(T[torch.Size([512, 512])]) 2022-11-23T02:11:25.0026771Z [2022-11-23 02:11:12,424] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0026926Z ---submod_2 graph--- 2022-11-23T02:11:25.0027030Z graph(): 2022-11-23T02:11:25.0027206Z %self_seq_3 : [#users=1] = placeholder[target=self_seq_3] 2022-11-23T02:11:25.0027441Z %self_seq_4_linear : [#users=1] = call_module[target=self_seq_4_linear](args = (%self_seq_3,), kwargs = {}) 2022-11-23T02:11:25.0027664Z %self_seq_5 : [#users=1] = call_module[target=self_seq_5](args = (%self_seq_4_linear,), kwargs = {}) 2022-11-23T02:11:25.0027785Z return self_seq_5 2022-11-23T02:11:25.0028220Z [2022-11-23 02:11:12,425] torch._dynamo.optimizations.distributed: [DEBUG] run_node output, output got args tuple(tuple(T[torch.Size([512, 512])])) 2022-11-23T02:11:25.0028490Z [2022-11-23 02:11:12,425] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0028640Z ---final graph--- 2022-11-23T02:11:25.0028749Z graph(): 2022-11-23T02:11:25.0028925Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0029112Z %self_seq_2_weight : [#users=1] = get_attr[target=self_seq_2_weight] 2022-11-23T02:11:25.0029327Z %submod_0 : [#users=1] = call_module[target=compiled_submod_0](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0029573Z %submod_1 : [#users=1] = call_module[target=compiled_submod_1](args = (%self_seq_2_weight, %submod_0), kwargs = {}) 2022-11-23T02:11:25.0029798Z %submod_2 : [#users=1] = call_module[target=compiled_submod_2](args = (%submod_1,), kwargs = {}) 2022-11-23T02:11:25.0029971Z return (submod_2,) 2022-11-23T02:11:25.0030117Z --------------- 2022-11-23T02:11:25.0030138Z 2022-11-23T02:11:25.0030470Z [2022-11-23 02:11:12,425] torch._dynamo.output_graph: [INFO] Step 2: done compiler function compile_fn 2022-11-23T02:11:25.0030739Z 
[2022-11-23 02:11:12,426] torch._dynamo.output_graph: [CODE] TRACED GRAPH 2022-11-23T02:11:25.0030935Z __compiled_fn_1 .26 opcode, name, target, args, kwargs 2022-11-23T02:11:25.0031063Z placeholder, x, x, (), {} 2022-11-23T02:11:25.0031240Z call_module, self_seq_0_linear, self_seq_0_linear, (x,), {} 2022-11-23T02:11:25.0031417Z call_module, self_seq_1, self_seq_1, (self_seq_0_linear,), {} 2022-11-23T02:11:25.0031562Z get_attr, self_seq_2_weight, self_seq_2_weight, (), {} 2022-11-23T02:11:25.0031711Z call_method, t, t, (self_seq_2_weight,), {} 2022-11-23T02:11:25.0032033Z call_function, mm, , (self_seq_1, t), {} 2022-11-23T02:11:25.0032187Z call_module, self_seq_3, self_seq_3, (mm,), {} 2022-11-23T02:11:25.0032377Z call_module, self_seq_4_linear, self_seq_4_linear, (self_seq_3,), {} 2022-11-23T02:11:25.0032553Z call_module, self_seq_5, self_seq_5, (self_seq_4_linear,), {} 2022-11-23T02:11:25.0032705Z output, output, output, ((self_seq_5,),), {} 2022-11-23T02:11:25.0032726Z 2022-11-23T02:11:25.0033249Z [2022-11-23 02:11:12,426] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 91 2022-11-23T02:11:25.0033380Z 92 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0033520Z 2 LOAD_METHOD 0 (seq) 2022-11-23T02:11:25.0033657Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0033787Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0033903Z 8 RETURN_VALUE 2022-11-23T02:11:25.0033923Z 2022-11-23T02:11:25.0034020Z 2022-11-23T02:11:25.0034488Z [2022-11-23 02:11:12,426] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 91 2022-11-23T02:11:25.0034633Z 91 0 LOAD_GLOBAL 1 (__compiled_fn_1) 2022-11-23T02:11:25.0034768Z 2 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0034899Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0035266Z 6 UNPACK_SEQUENCE 1 2022-11-23T02:11:25.0035398Z 8 RETURN_VALUE 2022-11-23T02:11:25.0035418Z 2022-11-23T02:11:25.0035542Z 
2022-11-23T02:11:25.0035807Z [2022-11-23 02:11:12,427] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0035918Z - 2022-11-23T02:11:25.0036071Z local 'x' TENSOR_MATCH 2022-11-23T02:11:25.0036173Z { 2022-11-23T02:11:25.0036379Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0036537Z 'code': None, 2022-11-23T02:11:25.0036833Z 'obj_weakref': 2022-11-23T02:11:25.0037181Z 'guarded_class': 2022-11-23T02:11:25.0037284Z } 2022-11-23T02:11:25.0037364Z 2022-11-23T02:11:25.0037476Z - 2022-11-23T02:11:25.0037651Z local 'self' NN_MODULE 2022-11-23T02:11:25.0037753Z { 2022-11-23T02:11:25.0037951Z 'guard_types': ['ID_MATCH'], 2022-11-23T02:11:25.0038190Z 'code': ['___check_obj_id(self, 140050945848944)'], 2022-11-23T02:11:25.0038483Z 'obj_weakref': 2022-11-23T02:11:25.0038782Z 'guarded_class': 2022-11-23T02:11:25.0038882Z } 2022-11-23T02:11:25.0038981Z 2022-11-23T02:11:25.0039091Z - 2022-11-23T02:11:25.0039278Z global 'torch' FUNCTION_MATCH 2022-11-23T02:11:25.0039475Z { 2022-11-23T02:11:25.0039656Z 'guard_types': None, 2022-11-23T02:11:25.0039794Z 'code': None, 2022-11-23T02:11:25.0039965Z 'obj_weakref': None 2022-11-23T02:11:25.0040139Z 'guarded_class': None 2022-11-23T02:11:25.0040237Z } 2022-11-23T02:11:25.0040335Z 2022-11-23T02:11:25.0040446Z - 2022-11-23T02:11:25.0040641Z local_nn_module 'self.seq' NN_MODULE 2022-11-23T02:11:25.0040742Z { 2022-11-23T02:11:25.0040913Z 'guard_types': None, 2022-11-23T02:11:25.0041068Z 'code': None, 2022-11-23T02:11:25.0041237Z 'obj_weakref': None 2022-11-23T02:11:25.0041410Z 'guarded_class': None 2022-11-23T02:11:25.0041510Z } 2022-11-23T02:11:25.0041588Z 2022-11-23T02:11:25.0041696Z - 2022-11-23T02:11:25.0041908Z local_nn_module 'self.seq[0]' NN_MODULE 2022-11-23T02:11:25.0042009Z { 2022-11-23T02:11:25.0042185Z 'guard_types': None, 2022-11-23T02:11:25.0042339Z 'code': None, 2022-11-23T02:11:25.0042508Z 'obj_weakref': None 2022-11-23T02:11:25.0042662Z 'guarded_class': None 2022-11-23T02:11:25.0042762Z } 
2022-11-23T02:11:25.0042861Z 2022-11-23T02:11:25.0042970Z - 2022-11-23T02:11:25.0043254Z local_nn_module 'self.seq[1]' NN_MODULE 2022-11-23T02:11:25.0043365Z { 2022-11-23T02:11:25.0043522Z 'guard_types': None, 2022-11-23T02:11:25.0043678Z 'code': None, 2022-11-23T02:11:25.0043846Z 'obj_weakref': None 2022-11-23T02:11:25.0044019Z 'guarded_class': None 2022-11-23T02:11:25.0044119Z } 2022-11-23T02:11:25.0044218Z 2022-11-23T02:11:25.0044324Z - 2022-11-23T02:11:25.0044516Z local_nn_module 'self.seq[2]' NN_MODULE 2022-11-23T02:11:25.0044616Z { 2022-11-23T02:11:25.0044794Z 'guard_types': None, 2022-11-23T02:11:25.0044948Z 'code': None, 2022-11-23T02:11:25.0045114Z 'obj_weakref': None 2022-11-23T02:11:25.0045287Z 'guarded_class': None 2022-11-23T02:11:25.0045386Z } 2022-11-23T02:11:25.0045467Z 2022-11-23T02:11:25.0045575Z - 2022-11-23T02:11:25.0045786Z local_nn_module 'self.seq[3]' NN_MODULE 2022-11-23T02:11:25.0045886Z { 2022-11-23T02:11:25.0046054Z 'guard_types': None, 2022-11-23T02:11:25.0046208Z 'code': None, 2022-11-23T02:11:25.0046377Z 'obj_weakref': None 2022-11-23T02:11:25.0046530Z 'guarded_class': None 2022-11-23T02:11:25.0046629Z } 2022-11-23T02:11:25.0046727Z 2022-11-23T02:11:25.0046836Z - 2022-11-23T02:11:25.0047043Z local_nn_module 'self.seq[4]' NN_MODULE 2022-11-23T02:11:25.0047144Z { 2022-11-23T02:11:25.0047297Z 'guard_types': None, 2022-11-23T02:11:25.0047457Z 'code': None, 2022-11-23T02:11:25.0047626Z 'obj_weakref': None 2022-11-23T02:11:25.0047797Z 'guarded_class': None 2022-11-23T02:11:25.0047896Z } 2022-11-23T02:11:25.0047994Z 2022-11-23T02:11:25.0048101Z - 2022-11-23T02:11:25.0048288Z local_nn_module 'self.seq[5]' NN_MODULE 2022-11-23T02:11:25.0048392Z { 2022-11-23T02:11:25.0048565Z 'guard_types': None, 2022-11-23T02:11:25.0048720Z 'code': None, 2022-11-23T02:11:25.0048887Z 'obj_weakref': None 2022-11-23T02:11:25.0049057Z 'guarded_class': None 2022-11-23T02:11:25.0049158Z } 2022-11-23T02:11:25.0049236Z 2022-11-23T02:11:25.0049344Z - 
2022-11-23T02:11:25.0049574Z local_nn_module 'self.seq[0].linear' NN_MODULE 2022-11-23T02:11:25.0049675Z { 2022-11-23T02:11:25.0049847Z 'guard_types': None, 2022-11-23T02:11:25.0050075Z 'code': None, 2022-11-23T02:11:25.0050224Z 'obj_weakref': None 2022-11-23T02:11:25.0050394Z 'guarded_class': None 2022-11-23T02:11:25.0050492Z } 2022-11-23T02:11:25.0050592Z 2022-11-23T02:11:25.0050703Z - 2022-11-23T02:11:25.0050943Z local_nn_module 'self.seq[2].weight' TENSOR_MATCH 2022-11-23T02:11:25.0051042Z { 2022-11-23T02:11:25.0051200Z 'guard_types': None, 2022-11-23T02:11:25.0051357Z 'code': None, 2022-11-23T02:11:25.0051524Z 'obj_weakref': None 2022-11-23T02:11:25.0051695Z 'guarded_class': None 2022-11-23T02:11:25.0051793Z } 2022-11-23T02:11:25.0051892Z 2022-11-23T02:11:25.0052001Z - 2022-11-23T02:11:25.0052213Z local_nn_module 'self.seq[4].linear' NN_MODULE 2022-11-23T02:11:25.0052317Z { 2022-11-23T02:11:25.0052489Z 'guard_types': None, 2022-11-23T02:11:25.0052650Z 'code': None, 2022-11-23T02:11:25.0052819Z 'obj_weakref': None 2022-11-23T02:11:25.0052989Z 'guarded_class': None 2022-11-23T02:11:25.0053089Z } 2022-11-23T02:11:25.0053168Z 2022-11-23T02:11:25.0053341Z frames [('total', 4), ('ok', 4)] 2022-11-23T02:11:25.0053653Z inline_call [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0053831Z unimplemented [] 2022-11-23T02:11:25.0054152Z graph_break [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0054432Z stats [('calls_captured', 8), ('fusions_possible', 7), ('unique_graphs', 1)] 2022-11-23T02:11:25.0054619Z aot_autograd [('total', 3), ('ok', 3)] 2022-11-23T02:11:25.0054771Z frames [('total', 2), ('ok', 2)] 2022-11-23T02:11:25.0055079Z inline_call [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0055195Z unimplemented [] 2022-11-23T02:11:25.0055502Z graph_break [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 
2022-11-23T02:11:25.0055786Z stats [('calls_captured', 7), ('fusions_possible', 6), ('unique_graphs', 1)] 2022-11-23T02:11:25.0055894Z ok (0.069s) 2022-11-23T02:11:25.0056358Z test_ddp_baseline_aot_eager (__main__.TestDistributed) ... [2022-11-23 02:11:12,931] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward 2022-11-23T02:11:25.0056802Z [2022-11-23 02:11:12,932] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:54 2022-11-23T02:11:25.0057081Z [2022-11-23 02:11:12,932] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:25.0057421Z [2022-11-23 02:11:12,932] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR net [NNModuleVariable()] 2022-11-23T02:11:25.0057768Z [2022-11-23 02:11:12,932] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [NNModuleVariable()] 2022-11-23T02:11:25.0058151Z [2022-11-23 02:11:12,932] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:25.0058489Z [2022-11-23 02:11:12,968] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0058821Z [2022-11-23 02:11:12,968] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing forward 2022-11-23T02:11:25.0059157Z [2022-11-23 02:11:12,969] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function compile_fn 2022-11-23T02:11:25.0059487Z [2022-11-23 02:11:13,000] torch._dynamo.output_graph: [INFO] Step 2: done compiler function compile_fn 2022-11-23T02:11:25.0059754Z [2022-11-23 02:11:13,000] torch._dynamo.output_graph: [CODE] TRACED GRAPH 2022-11-23T02:11:25.0059930Z __compiled_fn_2 .36 opcode, name, target, args, kwargs 2022-11-23T02:11:25.0060076Z placeholder, inputs, inputs, (), {} 2022-11-23T02:11:25.0060237Z call_module, self_net_0, self_net_0, (inputs,), {} 2022-11-23T02:11:25.0060467Z call_module, 
self_net_1, self_net_1, (self_net_0,), {} 2022-11-23T02:11:25.0060631Z call_module, self_net_2, self_net_2, (self_net_1,), {} 2022-11-23T02:11:25.0060792Z call_module, self_net_3, self_net_3, (self_net_2,), {} 2022-11-23T02:11:25.0060950Z call_module, self_net_4, self_net_4, (self_net_3,), {} 2022-11-23T02:11:25.0061087Z call_module, self_net_5, self_net_5, (self_net_4,), {} 2022-11-23T02:11:25.0061248Z call_module, self_net_6, self_net_6, (self_net_5,), {} 2022-11-23T02:11:25.0061406Z call_module, self_net_7, self_net_7, (self_net_6,), {} 2022-11-23T02:11:25.0061561Z output, output, output, ((self_net_7,),), {} 2022-11-23T02:11:25.0061582Z 2022-11-23T02:11:25.0062051Z [2022-11-23 02:11:13,001] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:25.0062193Z 54 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0062334Z 2 LOAD_METHOD 0 (net) 2022-11-23T02:11:25.0062478Z 4 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0062590Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0062708Z 8 RETURN_VALUE 2022-11-23T02:11:25.0062727Z 2022-11-23T02:11:25.0062824Z 2022-11-23T02:11:25.0063386Z [2022-11-23 02:11:13,001] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:25.0063555Z 53 0 LOAD_GLOBAL 1 (__compiled_fn_2) 2022-11-23T02:11:25.0063696Z 2 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0063829Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0063964Z 6 UNPACK_SEQUENCE 1 2022-11-23T02:11:25.0064060Z 8 RETURN_VALUE 2022-11-23T02:11:25.0064079Z 2022-11-23T02:11:25.0064176Z 2022-11-23T02:11:25.0064439Z [2022-11-23 02:11:13,001] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0064557Z - 2022-11-23T02:11:25.0064731Z local 'self' NN_MODULE 2022-11-23T02:11:25.0064833Z { 2022-11-23T02:11:25.0065028Z 'guard_types': ['ID_MATCH'], 2022-11-23T02:11:25.0065247Z 'code': ['___check_obj_id(self, 
140050945842656)'], 2022-11-23T02:11:25.0065548Z 'obj_weakref': 2022-11-23T02:11:25.0065868Z 'guarded_class': 2022-11-23T02:11:25.0065973Z } 2022-11-23T02:11:25.0066072Z 2022-11-23T02:11:25.0066182Z - 2022-11-23T02:11:25.0066367Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:25.0066447Z { 2022-11-23T02:11:25.0066651Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0066807Z 'code': None, 2022-11-23T02:11:25.0067099Z 'obj_weakref': 2022-11-23T02:11:25.0067446Z 'guarded_class': 2022-11-23T02:11:25.0067548Z } 2022-11-23T02:11:25.0067647Z 2022-11-23T02:11:25.0067737Z - 2022-11-23T02:11:25.0067945Z local_nn_module 'self.net' NN_MODULE 2022-11-23T02:11:25.0068044Z { 2022-11-23T02:11:25.0068222Z 'guard_types': None, 2022-11-23T02:11:25.0068382Z 'code': None, 2022-11-23T02:11:25.0068552Z 'obj_weakref': None 2022-11-23T02:11:25.0068725Z 'guarded_class': None 2022-11-23T02:11:25.0068805Z } 2022-11-23T02:11:25.0068903Z 2022-11-23T02:11:25.0069013Z - 2022-11-23T02:11:25.0069228Z local_nn_module 'self.net[0]' NN_MODULE 2022-11-23T02:11:25.0069329Z { 2022-11-23T02:11:25.0069502Z 'guard_types': None, 2022-11-23T02:11:25.0069656Z 'code': None, 2022-11-23T02:11:25.0069884Z 'obj_weakref': None 2022-11-23T02:11:25.0070055Z 'guarded_class': None 2022-11-23T02:11:25.0070155Z } 2022-11-23T02:11:25.0070254Z 2022-11-23T02:11:25.0070361Z - 2022-11-23T02:11:25.0070572Z local_nn_module 'self.net[1]' NN_MODULE 2022-11-23T02:11:25.0070652Z { 2022-11-23T02:11:25.0070831Z 'guard_types': None, 2022-11-23T02:11:25.0070987Z 'code': None, 2022-11-23T02:11:25.0071155Z 'obj_weakref': None 2022-11-23T02:11:25.0071326Z 'guarded_class': None 2022-11-23T02:11:25.0071425Z } 2022-11-23T02:11:25.0071523Z 2022-11-23T02:11:25.0071613Z - 2022-11-23T02:11:25.0071824Z local_nn_module 'self.net[2]' NN_MODULE 2022-11-23T02:11:25.0071925Z { 2022-11-23T02:11:25.0072098Z 'guard_types': None, 2022-11-23T02:11:25.0072252Z 'code': None, 2022-11-23T02:11:25.0072427Z 'obj_weakref': None 2022-11-23T02:11:25.0072599Z 
'guarded_class': None 2022-11-23T02:11:25.0072678Z } 2022-11-23T02:11:25.0072777Z 2022-11-23T02:11:25.0072887Z - 2022-11-23T02:11:25.0073096Z local_nn_module 'self.net[3]' NN_MODULE 2022-11-23T02:11:25.0073197Z { 2022-11-23T02:11:25.0073370Z 'guard_types': None, 2022-11-23T02:11:25.0073562Z 'code': None, 2022-11-23T02:11:25.0073748Z 'obj_weakref': None 2022-11-23T02:11:25.0073917Z 'guarded_class': None 2022-11-23T02:11:25.0074016Z } 2022-11-23T02:11:25.0074117Z 2022-11-23T02:11:25.0074227Z - 2022-11-23T02:11:25.0074436Z local_nn_module 'self.net[4]' NN_MODULE 2022-11-23T02:11:25.0074516Z { 2022-11-23T02:11:25.0074688Z 'guard_types': None, 2022-11-23T02:11:25.0074842Z 'code': None, 2022-11-23T02:11:25.0075010Z 'obj_weakref': None 2022-11-23T02:11:25.0075381Z 'guarded_class': None 2022-11-23T02:11:25.0075481Z } 2022-11-23T02:11:25.0075581Z 2022-11-23T02:11:25.0075669Z - 2022-11-23T02:11:25.0075879Z local_nn_module 'self.net[5]' NN_MODULE 2022-11-23T02:11:25.0075979Z { 2022-11-23T02:11:25.0076153Z 'guard_types': None, 2022-11-23T02:11:25.0076311Z 'code': None, 2022-11-23T02:11:25.0076481Z 'obj_weakref': None 2022-11-23T02:11:25.0076653Z 'guarded_class': None 2022-11-23T02:11:25.0076732Z } 2022-11-23T02:11:25.0076832Z 2022-11-23T02:11:25.0076942Z - 2022-11-23T02:11:25.0077151Z local_nn_module 'self.net[6]' NN_MODULE 2022-11-23T02:11:25.0077251Z { 2022-11-23T02:11:25.0077423Z 'guard_types': None, 2022-11-23T02:11:25.0077560Z 'code': None, 2022-11-23T02:11:25.0077730Z 'obj_weakref': None 2022-11-23T02:11:25.0077909Z 'guarded_class': None 2022-11-23T02:11:25.0078009Z } 2022-11-23T02:11:25.0078108Z 2022-11-23T02:11:25.0078218Z - 2022-11-23T02:11:25.0078428Z local_nn_module 'self.net[7]' NN_MODULE 2022-11-23T02:11:25.0078508Z { 2022-11-23T02:11:25.0078682Z 'guard_types': None, 2022-11-23T02:11:25.0078836Z 'code': None, 2022-11-23T02:11:25.0079006Z 'obj_weakref': None 2022-11-23T02:11:25.0079179Z 'guarded_class': None 2022-11-23T02:11:25.0079278Z } 
2022-11-23T02:11:25.0079376Z 2022-11-23T02:11:25.0079528Z frames [('total', 1), ('ok', 1)] 2022-11-23T02:11:25.0079808Z stats [('calls_captured', 8), ('fusions_possible', 7), ('unique_graphs', 1)] 2022-11-23T02:11:25.0079994Z aot_autograd [('total', 1), ('ok', 1)] 2022-11-23T02:11:25.0080100Z ok (0.716s) 2022-11-23T02:11:25.0080377Z test_ddp_baseline_inductor (__main__.TestDistributed) ... skip: Inductor+gpu needs triton and recent GPU arch (0.001s) 2022-11-23T02:11:25.0080747Z test_empty_graph_inductor (__main__.TestDistributed) ... skip: Inductor+gpu needs triton and recent GPU arch (0.001s) 2022-11-23T02:11:25.0080911Z test_graph_split (__main__.TestDistributed) 2022-11-23T02:11:25.0081396Z Just ensures that the appropriate number of splits happen (based on ... [2022-11-23 02:11:13,647] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing opt_fn 2022-11-23T02:11:25.0081844Z [2022-11-23 02:11:13,647] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:381 2022-11-23T02:11:25.0082150Z [2022-11-23 02:11:13,647] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF ddp_m [] 2022-11-23T02:11:25.0082598Z [2022-11-23 02:11:13,648] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0083075Z [2022-11-23 02:11:13,648] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UnspecializedNNModuleVariable(DistributedDataParallel), TensorVariable()] 2022-11-23T02:11:25.0083621Z [2022-11-23 02:11:13,649] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:25.0083836Z 1057 0 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0083992Z 2 LOAD_ATTR 1 (autograd) 2022-11-23T02:11:25.0084138Z 4 LOAD_ATTR 2 (profiler) 2022-11-23T02:11:25.0084297Z 6 LOAD_METHOD 3 (record_function) 2022-11-23T02:11:25.0084319Z 2022-11-23T02:11:25.0084594Z 1058 8 LOAD_CONST 1 
('DistributedDataParallel.forward') 2022-11-23T02:11:25.0084633Z 2022-11-23T02:11:25.0084748Z 1057 10 CALL_METHOD 1 2022-11-23T02:11:25.0084890Z 12 SETUP_WITH 147 (to 308) 2022-11-23T02:11:25.0085009Z 14 POP_TOP 2022-11-23T02:11:25.0085030Z 2022-11-23T02:11:25.0085169Z 1060 16 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0085324Z 18 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0085458Z 20 CALL_METHOD 0 2022-11-23T02:11:25.0085604Z 22 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:25.0085724Z 24 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0085896Z 26 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:25.0086043Z 28 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:25.0086062Z 2022-11-23T02:11:25.0086196Z 1061 30 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0086335Z 32 LOAD_ATTR 6 (logger) 2022-11-23T02:11:25.0086472Z 34 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0086601Z 36 IS_OP 1 2022-11-23T02:11:25.0086749Z 38 POP_JUMP_IF_TRUE 22 (to 44) 2022-11-23T02:11:25.0086859Z 40 LOAD_ASSERTION_ERROR 2022-11-23T02:11:25.0086998Z 42 RAISE_VARARGS 1 2022-11-23T02:11:25.0087017Z 2022-11-23T02:11:25.0087150Z 1062 >> 44 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0087287Z 46 LOAD_ATTR 6 (logger) 2022-11-23T02:11:25.0087461Z 48 LOAD_METHOD 7 (set_runtime_stats_and_log) 2022-11-23T02:11:25.0087598Z 50 CALL_METHOD 0 2022-11-23T02:11:25.0087708Z 52 POP_TOP 2022-11-23T02:11:25.0087727Z 2022-11-23T02:11:25.0087862Z 1063 54 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0087951Z 56 DUP_TOP 2022-11-23T02:11:25.0088105Z 58 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0088240Z 60 LOAD_CONST 2 (1) 2022-11-23T02:11:25.0088356Z 62 INPLACE_ADD 2022-11-23T02:11:25.0088538Z 64 ROT_TWO 2022-11-23T02:11:25.0088694Z 66 STORE_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0088713Z 2022-11-23T02:11:25.0088848Z 1064 68 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0088970Z 70 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0089135Z 72 LOAD_METHOD 10 (prepare_for_forward) 2022-11-23T02:11:25.0089271Z 74 CALL_METHOD 0 
2022-11-23T02:11:25.0089382Z 76 POP_TOP 2022-11-23T02:11:25.0089401Z 2022-11-23T02:11:25.0089536Z 1068 >> 78 LOAD_GLOBAL 11 (Join) 2022-11-23T02:11:25.0089694Z 80 LOAD_METHOD 12 (notify_join_context) 2022-11-23T02:11:25.0089826Z 82 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0089937Z 84 CALL_METHOD 1 2022-11-23T02:11:25.0090073Z 86 STORE_FAST 3 (work) 2022-11-23T02:11:25.0090095Z 2022-11-23T02:11:25.0090273Z 1069 88 LOAD_FAST 3 (work) 2022-11-23T02:11:25.0090424Z 90 POP_JUMP_IF_FALSE 54 (to 108) 2022-11-23T02:11:25.0090443Z 2022-11-23T02:11:25.0090573Z 1070 92 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0090715Z 94 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0090964Z 96 LOAD_METHOD 13 (_set_forward_pass_work_handle) 2022-11-23T02:11:25.0090985Z 2022-11-23T02:11:25.0091128Z 1071 98 LOAD_FAST 3 (work) 2022-11-23T02:11:25.0091260Z 100 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0091417Z 102 LOAD_ATTR 14 (_divide_by_initial_world_size) 2022-11-23T02:11:25.0091436Z 2022-11-23T02:11:25.0091572Z 1070 104 CALL_METHOD 2 2022-11-23T02:11:25.0091682Z 106 POP_TOP 2022-11-23T02:11:25.0091702Z 2022-11-23T02:11:25.0091843Z 1080 >> 108 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0092005Z 110 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0092136Z 112 CALL_METHOD 0 2022-11-23T02:11:25.0092284Z 114 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:25.0092417Z 116 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0092537Z 118 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0092697Z 120 LOAD_METHOD 15 (_rebuild_buckets) 2022-11-23T02:11:25.0092829Z 122 CALL_METHOD 0 2022-11-23T02:11:25.0092975Z 124 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:25.0092995Z 2022-11-23T02:11:25.0093134Z 1081 126 LOAD_GLOBAL 6 (logger) 2022-11-23T02:11:25.0093268Z 128 LOAD_METHOD 16 (info) 2022-11-23T02:11:25.0093287Z 2022-11-23T02:11:25.0093611Z 1082 130 LOAD_CONST 3 ('Reducer buckets have been rebuilt in this iteration.') 2022-11-23T02:11:25.0093635Z 2022-11-23T02:11:25.0093769Z 1081 132 CALL_METHOD 1 2022-11-23T02:11:25.0093859Z 134 
POP_TOP 2022-11-23T02:11:25.0093879Z 2022-11-23T02:11:25.0094017Z 1084 136 LOAD_CONST 4 (True) 2022-11-23T02:11:25.0094151Z 138 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0094313Z 140 STORE_ATTR 17 (_has_rebuilt_buckets) 2022-11-23T02:11:25.0094336Z 2022-11-23T02:11:25.0094479Z 1088 >> 142 LOAD_GLOBAL 18 (hasattr) 2022-11-23T02:11:25.0094614Z 144 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0094842Z 146 LOAD_CONST 5 ('buffer_hook') 2022-11-23T02:11:25.0094978Z 148 CALL_FUNCTION 2 2022-11-23T02:11:25.0095130Z 150 STORE_FAST 4 (buffer_hook_registered) 2022-11-23T02:11:25.0095150Z 2022-11-23T02:11:25.0095285Z 1089 152 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0095523Z 154 LOAD_METHOD 19 (_check_sync_bufs_pre_fwd) 2022-11-23T02:11:25.0095655Z 156 CALL_METHOD 0 2022-11-23T02:11:25.0095802Z 158 POP_JUMP_IF_FALSE 84 (to 168) 2022-11-23T02:11:25.0095821Z 2022-11-23T02:11:25.0095954Z 1090 160 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0096105Z 162 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:25.0096240Z 164 CALL_METHOD 0 2022-11-23T02:11:25.0096333Z 166 POP_TOP 2022-11-23T02:11:25.0096352Z 2022-11-23T02:11:25.0096484Z 1092 >> 168 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0096635Z 170 LOAD_ATTR 21 (_join_config) 2022-11-23T02:11:25.0096771Z 172 LOAD_ATTR 22 (enable) 2022-11-23T02:11:25.0096919Z 174 POP_JUMP_IF_FALSE 94 (to 188) 2022-11-23T02:11:25.0096937Z 2022-11-23T02:11:25.0097068Z 1094 176 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0097267Z 178 LOAD_ATTR 23 (_check_global_requires_backward_grad_sync) 2022-11-23T02:11:25.0097287Z 2022-11-23T02:11:25.0097425Z 1095 180 LOAD_CONST 6 (False) 2022-11-23T02:11:25.0097445Z 2022-11-23T02:11:25.0097683Z 1094 182 LOAD_CONST 7 (('is_joined_rank',)) 2022-11-23T02:11:25.0097801Z 184 CALL_FUNCTION_KW 1 2022-11-23T02:11:25.0097966Z 186 POP_TOP 2022-11-23T02:11:25.0097987Z 2022-11-23T02:11:25.0098125Z 1098 >> 188 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0098279Z 190 LOAD_ATTR 24 (_run_ddp_forward) 2022-11-23T02:11:25.0098416Z 192 LOAD_FAST 1 (inputs) 
2022-11-23T02:11:25.0098547Z 194 BUILD_MAP 0 2022-11-23T02:11:25.0098682Z 196 LOAD_FAST 2 (kwargs) 2022-11-23T02:11:25.0098792Z 198 DICT_MERGE 1 2022-11-23T02:11:25.0098929Z 200 CALL_FUNCTION_EX 1 2022-11-23T02:11:25.0099075Z 202 STORE_FAST 5 (output) 2022-11-23T02:11:25.0099094Z 2022-11-23T02:11:25.0099229Z 1102 204 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0099400Z 206 LOAD_METHOD 25 (_check_sync_bufs_post_fwd) 2022-11-23T02:11:25.0099534Z 208 CALL_METHOD 0 2022-11-23T02:11:25.0099685Z 210 POP_JUMP_IF_FALSE 110 (to 220) 2022-11-23T02:11:25.0099704Z 2022-11-23T02:11:25.0099839Z 1103 212 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0099971Z 214 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:25.0100100Z 216 CALL_METHOD 0 2022-11-23T02:11:25.0100209Z 218 POP_TOP 2022-11-23T02:11:25.0100229Z 2022-11-23T02:11:25.0100369Z 1105 >> 220 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0100526Z 222 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0100665Z 224 CALL_METHOD 0 2022-11-23T02:11:25.0100811Z 226 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:25.0100926Z 228 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0101100Z 230 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:25.0101245Z 232 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:25.0101268Z 2022-11-23T02:11:25.0101403Z 1106 234 LOAD_CONST 4 (True) 2022-11-23T02:11:25.0101538Z 236 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0101712Z 238 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:25.0101731Z 2022-11-23T02:11:25.0101866Z 1112 240 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0102037Z 242 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:25.0102183Z 244 POP_JUMP_IF_FALSE 137 (to 274) 2022-11-23T02:11:25.0102363Z 246 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0102515Z 248 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0102662Z 250 POP_JUMP_IF_TRUE 137 (to 274) 2022-11-23T02:11:25.0102681Z 2022-11-23T02:11:25.0102815Z 1114 252 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0102956Z 254 LOAD_ATTR 9 (reducer) 
2022-11-23T02:11:25.0103129Z 256 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:25.0103148Z 2022-11-23T02:11:25.0103285Z 1115 258 LOAD_GLOBAL 30 (list) 2022-11-23T02:11:25.0103439Z 260 LOAD_GLOBAL 31 (_find_tensors) 2022-11-23T02:11:25.0103557Z 262 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0103691Z 264 CALL_FUNCTION 1 2022-11-23T02:11:25.0103824Z 266 CALL_FUNCTION 1 2022-11-23T02:11:25.0103843Z 2022-11-23T02:11:25.0103973Z 1114 268 CALL_METHOD 1 2022-11-23T02:11:25.0104088Z 270 POP_TOP 2022-11-23T02:11:25.0104232Z 272 JUMP_FORWARD 10 (to 294) 2022-11-23T02:11:25.0104251Z 2022-11-23T02:11:25.0104383Z 1118 >> 274 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0104504Z 276 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0104722Z 278 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:25.0104866Z 280 BUILD_LIST 0 2022-11-23T02:11:25.0104997Z 282 CALL_METHOD 1 2022-11-23T02:11:25.0105108Z 284 POP_TOP 2022-11-23T02:11:25.0105250Z 286 JUMP_FORWARD 3 (to 294) 2022-11-23T02:11:25.0105269Z 2022-11-23T02:11:25.0105406Z 1120 >> 288 LOAD_CONST 6 (False) 2022-11-23T02:11:25.0105542Z 290 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0105696Z 292 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:25.0105816Z >> 294 POP_BLOCK 2022-11-23T02:11:25.0105835Z 2022-11-23T02:11:25.0105969Z 1057 296 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0106078Z 298 DUP_TOP 2022-11-23T02:11:25.0106184Z 300 DUP_TOP 2022-11-23T02:11:25.0106318Z 302 CALL_FUNCTION 3 2022-11-23T02:11:25.0106426Z 304 POP_TOP 2022-11-23T02:11:25.0106553Z 306 JUMP_FORWARD 8 (to 324) 2022-11-23T02:11:25.0106680Z >> 308 WITH_EXCEPT_START 2022-11-23T02:11:25.0106826Z 310 POP_JUMP_IF_TRUE 157 (to 314) 2022-11-23T02:11:25.0106954Z 312 RERAISE 1 2022-11-23T02:11:25.0107063Z >> 314 POP_TOP 2022-11-23T02:11:25.0107171Z 316 POP_TOP 2022-11-23T02:11:25.0107278Z 318 POP_TOP 2022-11-23T02:11:25.0107373Z 320 POP_EXCEPT 2022-11-23T02:11:25.0107480Z 322 POP_TOP 2022-11-23T02:11:25.0107499Z 2022-11-23T02:11:25.0107636Z 1124 >> 324 LOAD_FAST 0 
(self) 2022-11-23T02:11:25.0107803Z 326 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:25.0107950Z 328 POP_JUMP_IF_FALSE 168 (to 336) 2022-11-23T02:11:25.0108084Z 330 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0108236Z 332 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0108366Z 334 POP_JUMP_IF_FALSE 178 (to 356) 2022-11-23T02:11:25.0108405Z 2022-11-23T02:11:25.0108521Z 1125 >> 336 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0108671Z 338 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0108690Z 2022-11-23T02:11:25.0108824Z 1124 340 EXTENDED_ARG 1 2022-11-23T02:11:25.0108971Z 342 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:25.0108990Z 2022-11-23T02:11:25.0109122Z 1125 344 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0109275Z 346 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0109481Z 348 LOAD_CONST 2 (1) 2022-11-23T02:11:25.0109623Z 350 COMPARE_OP 2 (==) 2022-11-23T02:11:25.0109642Z 2022-11-23T02:11:25.0109756Z 1124 352 EXTENDED_ARG 1 2022-11-23T02:11:25.0109907Z 354 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:25.0109926Z 2022-11-23T02:11:25.0110065Z 1128 >> 356 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0110216Z 358 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0110235Z 2022-11-23T02:11:25.0110367Z 1129 360 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0110519Z 362 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0110539Z 2022-11-23T02:11:25.0110811Z 1127 364 LOAD_CONST 8 (('static_graph', 'num_iterations')) 2022-11-23T02:11:25.0110955Z 366 BUILD_CONST_KEY_MAP 2 2022-11-23T02:11:25.0111083Z 368 STORE_FAST 6 (state_dict) 2022-11-23T02:11:25.0111127Z 2022-11-23T02:11:25.0111277Z 1136 370 LOAD_GLOBAL 32 (_tree_flatten_with_rref) 2022-11-23T02:11:25.0111417Z 372 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0111551Z 374 CALL_FUNCTION 1 2022-11-23T02:11:25.0111570Z 2022-11-23T02:11:25.0111762Z 1132 376 UNPACK_SEQUENCE 3 2022-11-23T02:11:25.0111783Z 2022-11-23T02:11:25.0111951Z 1133 378 STORE_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0111971Z 2022-11-23T02:11:25.0112117Z 1134 
380 STORE_FAST 8 (treespec) 2022-11-23T02:11:25.0112136Z 2022-11-23T02:11:25.0112290Z 1135 382 STORE_FAST 9 (output_is_rref) 2022-11-23T02:11:25.0112309Z 2022-11-23T02:11:25.0112801Z 1137 384 LOAD_CONST 9 ( at 0x7f6052ac4660, file "/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1137>) 2022-11-23T02:11:25.0113143Z 386 LOAD_CONST 10 ('DistributedDataParallel.forward..') 2022-11-23T02:11:25.0113262Z 388 MAKE_FUNCTION 0 2022-11-23T02:11:25.0113410Z 390 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:25.0113550Z 392 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:25.0113710Z 394 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0113841Z 396 CALL_FUNCTION 1 2022-11-23T02:11:25.0113973Z 398 CALL_FUNCTION 1 2022-11-23T02:11:25.0114082Z 400 GET_ITER 2022-11-23T02:11:25.0114195Z 402 CALL_FUNCTION 1 2022-11-23T02:11:25.0114362Z 404 STORE_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0114382Z 2022-11-23T02:11:25.0114528Z 1140 406 LOAD_GLOBAL 35 (enumerate) 2022-11-23T02:11:25.0114685Z 408 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0114821Z 410 CALL_FUNCTION 1 2022-11-23T02:11:25.0114932Z 412 GET_ITER 2022-11-23T02:11:25.0115290Z >> 414 FOR_ITER 18 (to 452) 2022-11-23T02:11:25.0115438Z 416 UNPACK_SEQUENCE 2 2022-11-23T02:11:25.0115555Z 418 STORE_FAST 11 (i) 2022-11-23T02:11:25.0115697Z 420 STORE_FAST 5 (output) 2022-11-23T02:11:25.0115717Z 2022-11-23T02:11:25.0115857Z 1141 422 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0116001Z 424 LOAD_METHOD 36 (is_tensor) 2022-11-23T02:11:25.0116137Z 426 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0116267Z 428 CALL_METHOD 1 2022-11-23T02:11:25.0116412Z 430 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:25.0116530Z 432 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0116670Z 434 LOAD_ATTR 37 (grad_fn) 2022-11-23T02:11:25.0116908Z 436 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0117034Z 438 IS_OP 0 2022-11-23T02:11:25.0117179Z 440 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:25.0117198Z 2022-11-23T02:11:25.0117332Z 1142 
442 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0117501Z 444 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0117637Z 446 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0117734Z 448 STORE_SUBSCR 2022-11-23T02:11:25.0117874Z >> 450 JUMP_ABSOLUTE 207 (to 414) 2022-11-23T02:11:25.0117893Z 2022-11-23T02:11:25.0118034Z 1149 >> 452 LOAD_GLOBAL 38 (_DDPSink) 2022-11-23T02:11:25.0118170Z 454 LOAD_ATTR 39 (apply) 2022-11-23T02:11:25.0118189Z 2022-11-23T02:11:25.0118322Z 1150 456 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0118466Z 458 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0118486Z 2022-11-23T02:11:25.0118633Z 1151 460 LOAD_FAST 6 (state_dict) 2022-11-23T02:11:25.0118652Z 2022-11-23T02:11:25.0118781Z 1149 462 BUILD_LIST 2 2022-11-23T02:11:25.0118800Z 2022-11-23T02:11:25.0118956Z 1152 464 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0118975Z 2022-11-23T02:11:25.0119159Z 1149 466 LIST_EXTEND 1 2022-11-23T02:11:25.0119292Z 468 LIST_TO_TUPLE 2022-11-23T02:11:25.0119425Z 470 CALL_FUNCTION_EX 0 2022-11-23T02:11:25.0119598Z 472 STORE_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:25.0119618Z 2022-11-23T02:11:25.0119755Z 1154 474 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:25.0119891Z 476 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:25.0120052Z 478 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0120171Z 480 CALL_FUNCTION 1 2022-11-23T02:11:25.0120302Z 482 CALL_FUNCTION 1 2022-11-23T02:11:25.0120410Z 484 GET_ITER 2022-11-23T02:11:25.0120548Z >> 486 FOR_ITER 15 (to 518) 2022-11-23T02:11:25.0120683Z 488 STORE_FAST 11 (i) 2022-11-23T02:11:25.0120702Z 2022-11-23T02:11:25.0120868Z 1155 490 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0121003Z 492 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0121117Z 494 BINARY_SUBSCR 2022-11-23T02:11:25.0121235Z 496 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0121361Z 498 IS_OP 0 2022-11-23T02:11:25.0121492Z 500 EXTENDED_ARG 1 2022-11-23T02:11:25.0121638Z 502 POP_JUMP_IF_FALSE 258 (to 516) 2022-11-23T02:11:25.0121657Z 2022-11-23T02:11:25.0121829Z 1156 504 
LOAD_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:25.0121968Z 506 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0122087Z 508 BINARY_SUBSCR 2022-11-23T02:11:25.0122229Z 510 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0122362Z 512 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0122477Z 514 STORE_SUBSCR 2022-11-23T02:11:25.0122625Z >> 516 JUMP_ABSOLUTE 243 (to 486) 2022-11-23T02:11:25.0122644Z 2022-11-23T02:11:25.0122816Z 1159 >> 518 LOAD_GLOBAL 40 (_tree_unflatten_with_rref) 2022-11-23T02:11:25.0122836Z 2022-11-23T02:11:25.0122994Z 1160 520 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0123136Z 522 LOAD_FAST 8 (treespec) 2022-11-23T02:11:25.0123290Z 524 LOAD_FAST 9 (output_is_rref) 2022-11-23T02:11:25.0123310Z 2022-11-23T02:11:25.0123440Z 1159 526 CALL_FUNCTION 3 2022-11-23T02:11:25.0123630Z 528 STORE_FAST 5 (output) 2022-11-23T02:11:25.0123650Z 2022-11-23T02:11:25.0123785Z 1162 >> 530 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0123900Z 532 RETURN_VALUE 2022-11-23T02:11:25.0123995Z 2022-11-23T02:11:25.0124015Z 2022-11-23T02:11:25.0124491Z [2022-11-23 02:11:13,651] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:25.0124798Z [2022-11-23 02:11:13,651] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [] 2022-11-23T02:11:25.0125299Z [2022-11-23 02:11:13,651] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR autograd [TorchVariable()] 2022-11-23T02:11:25.0125834Z [2022-11-23 02:11:13,652] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR profiler [TorchVariable()] 2022-11-23T02:11:25.0126419Z [2022-11-23 02:11:13,652] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR record_function [TorchVariable()] 2022-11-23T02:11:25.0126926Z [2022-11-23 02:11:13,652] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1058 2022-11-23T02:11:25.0127444Z [2022-11-23 
02:11:13,652] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST DistributedDataParallel.forward [TorchVariable()] 2022-11-23T02:11:25.0127896Z [2022-11-23 02:11:13,653] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:25.0128387Z [2022-11-23 02:11:13,653] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(), ConstantVariable(str)] 2022-11-23T02:11:25.0128700Z [2022-11-23 02:11:13,653] torch._dynamo.variables.torch: [WARNING] Profiler will be ignored 2022-11-23T02:11:25.0129046Z [2022-11-23 02:11:13,653] torch._dynamo.symbolic_convert: [DEBUG] TRACE SETUP_WITH 308 [NullContextVariable()] 2022-11-23T02:11:25.0129468Z [2022-11-23 02:11:13,653] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_TOP None [WithExitFunctionVariable(), ConstantVariable(NoneType)] 2022-11-23T02:11:25.0129924Z [2022-11-23 02:11:13,653] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1060 2022-11-23T02:11:25.0130290Z [2022-11-23 02:11:13,653] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [WithExitFunctionVariable()] 2022-11-23T02:11:25.0130870Z [2022-11-23 02:11:13,653] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_grad_enabled [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:25.0131338Z [2022-11-23 02:11:13,654] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:25.0131768Z [2022-11-23 02:11:13,654] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0132113Z [2022-11-23 02:11:13,654] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0132660Z [2022-11-23 02:11:13,654] torch._dynamo.symbolic_convert: [DEBUG] TRACE 
LOAD_ATTR require_backward_grad_sync [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0133085Z [2022-11-23 02:11:13,654] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0133600Z [2022-11-23 02:11:13,654] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1061 2022-11-23T02:11:25.0133962Z [2022-11-23 02:11:13,654] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0134473Z [2022-11-23 02:11:13,655] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0134918Z [2022-11-23 02:11:13,655] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:25.0135403Z [2022-11-23 02:11:13,656] torch._dynamo.symbolic_convert: [DEBUG] TRACE IS_OP 1 [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger), ConstantVariable(NoneType)] 2022-11-23T02:11:25.0135826Z [2022-11-23 02:11:13,656] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 44 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0136278Z [2022-11-23 02:11:13,656] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1062 2022-11-23T02:11:25.0136638Z [2022-11-23 02:11:13,656] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0137197Z [2022-11-23 02:11:13,656] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0137670Z [2022-11-23 02:11:13,656] 
torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR set_runtime_stats_and_log [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:25.0138131Z [2022-11-23 02:11:13,657] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), UserDefinedObjectVariable(instancemethod)] 2022-11-23T02:11:25.0138592Z [2022-11-23 02:11:13,658] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 379 2022-11-23T02:11:25.0138741Z 381 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:25.0138886Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:25.0139022Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0139141Z 6 RETURN_VALUE 2022-11-23T02:11:25.0139162Z 2022-11-23T02:11:25.0139258Z 2022-11-23T02:11:25.0139717Z [2022-11-23 02:11:13,658] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 379 2022-11-23T02:11:25.0139841Z 379 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:25.0139978Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:25.0139997Z 2022-11-23T02:11:25.0140129Z 381 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0140243Z 6 RETURN_VALUE 2022-11-23T02:11:25.0140266Z 2022-11-23T02:11:25.0140362Z 2022-11-23T02:11:25.0140623Z [2022-11-23 02:11:13,659] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0140735Z - 2022-11-23T02:11:25.0140893Z local 'ddp_m' TYPE_MATCH 2022-11-23T02:11:25.0140994Z { 2022-11-23T02:11:25.0141193Z 'guard_types': ['TYPE_MATCH'], 2022-11-23T02:11:25.0141437Z 'code': ['___check_type_id(ddp_m, 94883284555664)'], 2022-11-23T02:11:25.0141788Z 'obj_weakref': 2022-11-23T02:11:25.0142158Z 'guarded_class': 2022-11-23T02:11:25.0142262Z } 2022-11-23T02:11:25.0142361Z 2022-11-23T02:11:25.0142454Z - 2022-11-23T02:11:25.0142635Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:25.0142733Z { 2022-11-23T02:11:25.0142936Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0143157Z 'code': 
None, 2022-11-23T02:11:25.0143447Z 'obj_weakref': 2022-11-23T02:11:25.0143788Z 'guarded_class': 2022-11-23T02:11:25.0143869Z } 2022-11-23T02:11:25.0143971Z 2022-11-23T02:11:25.0144309Z [2022-11-23 02:11:13,660] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward 2022-11-23T02:11:25.0144745Z [2022-11-23 02:11:13,661] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:54 2022-11-23T02:11:25.0145042Z [2022-11-23 02:11:13,661] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:25.0145373Z [2022-11-23 02:11:13,661] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR net [NNModuleVariable()] 2022-11-23T02:11:25.0145720Z [2022-11-23 02:11:13,661] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [NNModuleVariable()] 2022-11-23T02:11:25.0146096Z [2022-11-23 02:11:13,661] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:25.0146433Z [2022-11-23 02:11:13,696] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0146814Z [2022-11-23 02:11:13,696] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing forward 2022-11-23T02:11:25.0147158Z [2022-11-23 02:11:13,698] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function compile_fn 2022-11-23T02:11:25.0147592Z [2022-11-23 02:11:13,698] torch._dynamo.optimizations.distributed: [INFO] DDPOptimizer used bucket cap 26214400 and produced the following buckets: 2022-11-23T02:11:25.0148025Z [2022-11-23 02:11:13,698] torch._dynamo.optimizations.distributed: [INFO] Please `pip install tabulate` in order to pretty-print ddp bucket sizes 2022-11-23T02:11:25.0148320Z [2022-11-23 02:11:13,701] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0148470Z ---orig graph--- 2022-11-23T02:11:25.0148575Z graph(): 
2022-11-23T02:11:25.0148767Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:25.0148966Z %self_net_0 : [#users=1] = call_module[target=self_net_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:25.0149180Z %self_net_1 : [#users=1] = call_module[target=self_net_1](args = (%self_net_0,), kwargs = {}) 2022-11-23T02:11:25.0149385Z %self_net_2 : [#users=1] = call_module[target=self_net_2](args = (%self_net_1,), kwargs = {}) 2022-11-23T02:11:25.0149585Z %self_net_3 : [#users=1] = call_module[target=self_net_3](args = (%self_net_2,), kwargs = {}) 2022-11-23T02:11:25.0149783Z %self_net_4 : [#users=1] = call_module[target=self_net_4](args = (%self_net_3,), kwargs = {}) 2022-11-23T02:11:25.0149978Z %self_net_5 : [#users=1] = call_module[target=self_net_5](args = (%self_net_4,), kwargs = {}) 2022-11-23T02:11:25.0150179Z %self_net_6 : [#users=1] = call_module[target=self_net_6](args = (%self_net_5,), kwargs = {}) 2022-11-23T02:11:25.0150375Z %self_net_7 : [#users=1] = call_module[target=self_net_7](args = (%self_net_6,), kwargs = {}) 2022-11-23T02:11:25.0150476Z return (self_net_7,) 2022-11-23T02:11:25.0150517Z 2022-11-23T02:11:25.0150653Z ---split graph--- 2022-11-23T02:11:25.0150757Z graph(): 2022-11-23T02:11:25.0150948Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:25.0151152Z %submod_0 : [#users=1] = call_module[target=submod_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:25.0151357Z %submod_1 : [#users=1] = call_module[target=submod_1](args = (%submod_0,), kwargs = {}) 2022-11-23T02:11:25.0151552Z %submod_2 : [#users=1] = call_module[target=submod_2](args = (%submod_1,), kwargs = {}) 2022-11-23T02:11:25.0151667Z return (submod_2,) 2022-11-23T02:11:25.0151686Z 2022-11-23T02:11:25.0151894Z ---submod_0 graph--- 2022-11-23T02:11:25.0151995Z graph(): 2022-11-23T02:11:25.0152181Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:25.0152390Z %self_net_0 : [#users=1] = 
call_module[target=self_net_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:25.0152600Z %self_net_1 : [#users=1] = call_module[target=self_net_1](args = (%self_net_0,), kwargs = {}) 2022-11-23T02:11:25.0152721Z return self_net_1 2022-11-23T02:11:25.0152741Z 2022-11-23T02:11:25.0152899Z ---submod_1 graph--- 2022-11-23T02:11:25.0153001Z graph(): 2022-11-23T02:11:25.0153157Z %self_net_1 : [#users=1] = placeholder[target=self_net_1] 2022-11-23T02:11:25.0153367Z %self_net_2 : [#users=1] = call_module[target=self_net_2](args = (%self_net_1,), kwargs = {}) 2022-11-23T02:11:25.0153569Z %self_net_3 : [#users=1] = call_module[target=self_net_3](args = (%self_net_2,), kwargs = {}) 2022-11-23T02:11:25.0153682Z return self_net_3 2022-11-23T02:11:25.0153701Z 2022-11-23T02:11:25.0153859Z ---submod_2 graph--- 2022-11-23T02:11:25.0153961Z graph(): 2022-11-23T02:11:25.0154137Z %self_net_3 : [#users=1] = placeholder[target=self_net_3] 2022-11-23T02:11:25.0154337Z %self_net_4 : [#users=1] = call_module[target=self_net_4](args = (%self_net_3,), kwargs = {}) 2022-11-23T02:11:25.0154517Z %self_net_5 : [#users=1] = call_module[target=self_net_5](args = (%self_net_4,), kwargs = {}) 2022-11-23T02:11:25.0154768Z %self_net_6 : [#users=1] = call_module[target=self_net_6](args = (%self_net_5,), kwargs = {}) 2022-11-23T02:11:25.0154974Z %self_net_7 : [#users=1] = call_module[target=self_net_7](args = (%self_net_6,), kwargs = {}) 2022-11-23T02:11:25.0155238Z return self_net_7 2022-11-23T02:11:25.0155259Z 2022-11-23T02:11:25.0155407Z --------------- 2022-11-23T02:11:25.0155426Z 2022-11-23T02:11:25.0155819Z [2022-11-23 02:11:13,702] torch._dynamo.optimizations.distributed: [DEBUG] run_node placeholder, inputs got args tuple() 2022-11-23T02:11:25.0156250Z [2022-11-23 02:11:13,702] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_0 got args tuple(T[torch.Size([20, 10])]) 2022-11-23T02:11:25.0156547Z [2022-11-23 02:11:13,702] torch._dynamo.optimizations.distributed: 
[DEBUG] 2022-11-23T02:11:25.0156683Z ---submod_0 graph--- 2022-11-23T02:11:25.0156785Z graph(): 2022-11-23T02:11:25.0156973Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:25.0157189Z %self_net_0 : [#users=1] = call_module[target=self_net_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:25.0157400Z %self_net_1 : [#users=1] = call_module[target=self_net_1](args = (%self_net_0,), kwargs = {}) 2022-11-23T02:11:25.0157514Z return self_net_1 2022-11-23T02:11:25.0157941Z [2022-11-23 02:11:13,702] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_1 got args tuple(T[torch.Size([20, 5000])]) 2022-11-23T02:11:25.0158231Z [2022-11-23 02:11:13,703] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0158371Z ---submod_1 graph--- 2022-11-23T02:11:25.0158473Z graph(): 2022-11-23T02:11:25.0158647Z %self_net_1 : [#users=1] = placeholder[target=self_net_1] 2022-11-23T02:11:25.0158857Z %self_net_2 : [#users=1] = call_module[target=self_net_2](args = (%self_net_1,), kwargs = {}) 2022-11-23T02:11:25.0159060Z %self_net_3 : [#users=1] = call_module[target=self_net_3](args = (%self_net_2,), kwargs = {}) 2022-11-23T02:11:25.0159179Z return self_net_3 2022-11-23T02:11:25.0159610Z [2022-11-23 02:11:13,703] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_2 got args tuple(T[torch.Size([20, 5000])]) 2022-11-23T02:11:25.0159900Z [2022-11-23 02:11:13,703] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0160037Z ---submod_2 graph--- 2022-11-23T02:11:25.0160141Z graph(): 2022-11-23T02:11:25.0160315Z %self_net_3 : [#users=1] = placeholder[target=self_net_3] 2022-11-23T02:11:25.0160525Z %self_net_4 : [#users=1] = call_module[target=self_net_4](args = (%self_net_3,), kwargs = {}) 2022-11-23T02:11:25.0160833Z %self_net_5 : [#users=1] = call_module[target=self_net_5](args = (%self_net_4,), kwargs = {}) 2022-11-23T02:11:25.0161038Z %self_net_6 : [#users=1] = 
call_module[target=self_net_6](args = (%self_net_5,), kwargs = {}) 2022-11-23T02:11:25.0161239Z %self_net_7 : [#users=1] = call_module[target=self_net_7](args = (%self_net_6,), kwargs = {}) 2022-11-23T02:11:25.0161359Z return self_net_7 2022-11-23T02:11:25.0161776Z [2022-11-23 02:11:13,704] torch._dynamo.optimizations.distributed: [DEBUG] run_node output, output got args tuple(tuple(T[torch.Size([20, 5])])) 2022-11-23T02:11:25.0162063Z [2022-11-23 02:11:13,705] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0162211Z ---final graph--- 2022-11-23T02:11:25.0162313Z graph(): 2022-11-23T02:11:25.0162502Z %inputs : torch.Tensor [#users=1] = placeholder[target=inputs] 2022-11-23T02:11:25.0162724Z %submod_0 : [#users=1] = call_module[target=compiled_submod_0](args = (%inputs,), kwargs = {}) 2022-11-23T02:11:25.0162951Z %submod_1 : [#users=1] = call_module[target=compiled_submod_1](args = (%submod_0,), kwargs = {}) 2022-11-23T02:11:25.0163175Z %submod_2 : [#users=1] = call_module[target=compiled_submod_2](args = (%submod_1,), kwargs = {}) 2022-11-23T02:11:25.0163272Z return (submod_2,) 2022-11-23T02:11:25.0163414Z --------------- 2022-11-23T02:11:25.0163435Z 2022-11-23T02:11:25.0163837Z [2022-11-23 02:11:13,705] torch._dynamo.output_graph: [INFO] Step 2: done compiler function compile_fn 2022-11-23T02:11:25.0164120Z [2022-11-23 02:11:13,705] torch._dynamo.output_graph: [CODE] TRACED GRAPH 2022-11-23T02:11:25.0164312Z __compiled_fn_3 .43 opcode, name, target, args, kwargs 2022-11-23T02:11:25.0164457Z placeholder, inputs, inputs, (), {} 2022-11-23T02:11:25.0164618Z call_module, self_net_0, self_net_0, (inputs,), {} 2022-11-23T02:11:25.0164762Z call_module, self_net_1, self_net_1, (self_net_0,), {} 2022-11-23T02:11:25.0164922Z call_module, self_net_2, self_net_2, (self_net_1,), {} 2022-11-23T02:11:25.0165086Z call_module, self_net_3, self_net_3, (self_net_2,), {} 2022-11-23T02:11:25.0165245Z call_module, self_net_4, self_net_4, (self_net_3,), {} 
2022-11-23T02:11:25.0165401Z call_module, self_net_5, self_net_5, (self_net_4,), {} 2022-11-23T02:11:25.0165556Z call_module, self_net_6, self_net_6, (self_net_5,), {} 2022-11-23T02:11:25.0165717Z call_module, self_net_7, self_net_7, (self_net_6,), {} 2022-11-23T02:11:25.0165871Z output, output, output, ((self_net_7,),), {} 2022-11-23T02:11:25.0165891Z 2022-11-23T02:11:25.0166341Z [2022-11-23 02:11:13,706] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:25.0166481Z 54 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0166622Z 2 LOAD_METHOD 0 (net) 2022-11-23T02:11:25.0166760Z 4 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0166895Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0167013Z 8 RETURN_VALUE 2022-11-23T02:11:25.0167033Z 2022-11-23T02:11:25.0167128Z 2022-11-23T02:11:25.0167593Z [2022-11-23 02:11:13,706] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:25.0167738Z 53 0 LOAD_GLOBAL 1 (__compiled_fn_3) 2022-11-23T02:11:25.0167880Z 2 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0168012Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0168148Z 6 UNPACK_SEQUENCE 1 2022-11-23T02:11:25.0168264Z 8 RETURN_VALUE 2022-11-23T02:11:25.0168283Z 2022-11-23T02:11:25.0168377Z 2022-11-23T02:11:25.0168639Z [2022-11-23 02:11:13,706] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0168730Z - 2022-11-23T02:11:25.0168903Z local 'self' NN_MODULE 2022-11-23T02:11:25.0169072Z { 2022-11-23T02:11:25.0169270Z 'guard_types': ['ID_MATCH'], 2022-11-23T02:11:25.0169506Z 'code': ['___check_obj_id(self, 140050757741360)'], 2022-11-23T02:11:25.0169801Z 'obj_weakref': 2022-11-23T02:11:25.0170119Z 'guarded_class': 2022-11-23T02:11:25.0170228Z } 2022-11-23T02:11:25.0170310Z 2022-11-23T02:11:25.0170420Z - 2022-11-23T02:11:25.0170607Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:25.0170707Z { 
2022-11-23T02:11:25.0170912Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0171068Z 'code': None, 2022-11-23T02:11:25.0171342Z 'obj_weakref': 2022-11-23T02:11:25.0171684Z 'guarded_class': 2022-11-23T02:11:25.0171788Z } 2022-11-23T02:11:25.0171886Z 2022-11-23T02:11:25.0171996Z - 2022-11-23T02:11:25.0172202Z local_nn_module 'self.net' NN_MODULE 2022-11-23T02:11:25.0172304Z { 2022-11-23T02:11:25.0172460Z 'guard_types': None, 2022-11-23T02:11:25.0172615Z 'code': None, 2022-11-23T02:11:25.0172844Z 'obj_weakref': None 2022-11-23T02:11:25.0173030Z 'guarded_class': None 2022-11-23T02:11:25.0173128Z } 2022-11-23T02:11:25.0173225Z 2022-11-23T02:11:25.0173332Z - 2022-11-23T02:11:25.0173525Z local_nn_module 'self.net[0]' NN_MODULE 2022-11-23T02:11:25.0173623Z { 2022-11-23T02:11:25.0173795Z 'guard_types': None, 2022-11-23T02:11:25.0173948Z 'code': None, 2022-11-23T02:11:25.0174119Z 'obj_weakref': None 2022-11-23T02:11:25.0174292Z 'guarded_class': None 2022-11-23T02:11:25.0174395Z } 2022-11-23T02:11:25.0174474Z 2022-11-23T02:11:25.0174581Z - 2022-11-23T02:11:25.0174792Z local_nn_module 'self.net[1]' NN_MODULE 2022-11-23T02:11:25.0174891Z { 2022-11-23T02:11:25.0175062Z 'guard_types': None, 2022-11-23T02:11:25.0175217Z 'code': None, 2022-11-23T02:11:25.0175386Z 'obj_weakref': None 2022-11-23T02:11:25.0175541Z 'guarded_class': None 2022-11-23T02:11:25.0175640Z } 2022-11-23T02:11:25.0175737Z 2022-11-23T02:11:25.0175841Z - 2022-11-23T02:11:25.0176050Z local_nn_module 'self.net[2]' NN_MODULE 2022-11-23T02:11:25.0176150Z { 2022-11-23T02:11:25.0176301Z 'guard_types': None, 2022-11-23T02:11:25.0176454Z 'code': None, 2022-11-23T02:11:25.0176620Z 'obj_weakref': None 2022-11-23T02:11:25.0176791Z 'guarded_class': None 2022-11-23T02:11:25.0176897Z } 2022-11-23T02:11:25.0176995Z 2022-11-23T02:11:25.0177104Z - 2022-11-23T02:11:25.0177294Z local_nn_module 'self.net[3]' NN_MODULE 2022-11-23T02:11:25.0177394Z { 2022-11-23T02:11:25.0177565Z 'guard_types': None, 2022-11-23T02:11:25.0177720Z 
'code': None, 2022-11-23T02:11:25.0177886Z 'obj_weakref': None 2022-11-23T02:11:25.0178059Z 'guarded_class': None 2022-11-23T02:11:25.0178157Z } 2022-11-23T02:11:25.0178236Z 2022-11-23T02:11:25.0178343Z - 2022-11-23T02:11:25.0178550Z local_nn_module 'self.net[4]' NN_MODULE 2022-11-23T02:11:25.0178648Z { 2022-11-23T02:11:25.0178819Z 'guard_types': None, 2022-11-23T02:11:25.0178973Z 'code': None, 2022-11-23T02:11:25.0179122Z 'obj_weakref': None 2022-11-23T02:11:25.0179295Z 'guarded_class': None 2022-11-23T02:11:25.0179395Z } 2022-11-23T02:11:25.0179562Z 2022-11-23T02:11:25.0179673Z - 2022-11-23T02:11:25.0179881Z local_nn_module 'self.net[5]' NN_MODULE 2022-11-23T02:11:25.0179980Z { 2022-11-23T02:11:25.0180132Z 'guard_types': None, 2022-11-23T02:11:25.0180286Z 'code': None, 2022-11-23T02:11:25.0180452Z 'obj_weakref': None 2022-11-23T02:11:25.0180625Z 'guarded_class': None 2022-11-23T02:11:25.0180725Z } 2022-11-23T02:11:25.0180822Z 2022-11-23T02:11:25.0180930Z - 2022-11-23T02:11:25.0181119Z local_nn_module 'self.net[6]' NN_MODULE 2022-11-23T02:11:25.0181218Z { 2022-11-23T02:11:25.0181385Z 'guard_types': None, 2022-11-23T02:11:25.0181537Z 'code': None, 2022-11-23T02:11:25.0181702Z 'obj_weakref': None 2022-11-23T02:11:25.0181875Z 'guarded_class': None 2022-11-23T02:11:25.0181955Z } 2022-11-23T02:11:25.0182059Z 2022-11-23T02:11:25.0182168Z - 2022-11-23T02:11:25.0182375Z local_nn_module 'self.net[7]' NN_MODULE 2022-11-23T02:11:25.0182475Z { 2022-11-23T02:11:25.0182646Z 'guard_types': None, 2022-11-23T02:11:25.0182798Z 'code': None, 2022-11-23T02:11:25.0182945Z 'obj_weakref': None 2022-11-23T02:11:25.0183116Z 'guarded_class': None 2022-11-23T02:11:25.0183270Z } 2022-11-23T02:11:25.0183376Z 2022-11-23T02:11:25.0183549Z frames [('total', 2), ('ok', 2)] 2022-11-23T02:11:25.0183859Z inline_call [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0183977Z unimplemented [] 2022-11-23T02:11:25.0184269Z graph_break [('call_function 
UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0184545Z stats [('calls_captured', 8), ('fusions_possible', 7), ('unique_graphs', 1)] 2022-11-23T02:11:25.0184649Z ok (0.574s) 2022-11-23T02:11:25.0184830Z test_graph_split_inductor (__main__.TestDistributed) 2022-11-23T02:11:25.0185082Z Same as above, but using inductor backend. ... skip: Inductor+gpu needs triton and recent GPU arch (0.001s) 2022-11-23T02:11:25.0185256Z test_ignored_parameters (__main__.TestDistributed) 2022-11-23T02:11:25.0185778Z Verifies ddp graph-split logic ignores parameters marked to ignore on DDP module. ... [2022-11-23 02:11:13,735] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing opt_fn 2022-11-23T02:11:25.0186221Z [2022-11-23 02:11:13,735] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:498 2022-11-23T02:11:25.0186505Z [2022-11-23 02:11:13,735] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF ddp_m [] 2022-11-23T02:11:25.0186949Z [2022-11-23 02:11:13,735] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0187427Z [2022-11-23 02:11:13,735] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UnspecializedNNModuleVariable(DistributedDataParallel), TensorVariable()] 2022-11-23T02:11:25.0187974Z [2022-11-23 02:11:13,737] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:25.0188123Z 1057 0 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0188269Z 2 LOAD_ATTR 1 (autograd) 2022-11-23T02:11:25.0188411Z 4 LOAD_ATTR 2 (profiler) 2022-11-23T02:11:25.0188567Z 6 LOAD_METHOD 3 (record_function) 2022-11-23T02:11:25.0188588Z 2022-11-23T02:11:25.0188882Z 1058 8 LOAD_CONST 1 ('DistributedDataParallel.forward') 2022-11-23T02:11:25.0188903Z 2022-11-23T02:11:25.0189038Z 1057 10 CALL_METHOD 1 2022-11-23T02:11:25.0189161Z 12 SETUP_WITH 147 (to 308) 
2022-11-23T02:11:25.0189340Z 14 POP_TOP 2022-11-23T02:11:25.0189361Z 2022-11-23T02:11:25.0189500Z 1060 16 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0189653Z 18 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0189783Z 20 CALL_METHOD 0 2022-11-23T02:11:25.0189927Z 22 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:25.0190064Z 24 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0190285Z 26 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:25.0190431Z 28 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:25.0190451Z 2022-11-23T02:11:25.0190584Z 1061 30 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0190723Z 32 LOAD_ATTR 6 (logger) 2022-11-23T02:11:25.0190858Z 34 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0190985Z 36 IS_OP 1 2022-11-23T02:11:25.0191132Z 38 POP_JUMP_IF_TRUE 22 (to 44) 2022-11-23T02:11:25.0191260Z 40 LOAD_ASSERTION_ERROR 2022-11-23T02:11:25.0191376Z 42 RAISE_VARARGS 1 2022-11-23T02:11:25.0191396Z 2022-11-23T02:11:25.0191529Z 1062 >> 44 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0191665Z 46 LOAD_ATTR 6 (logger) 2022-11-23T02:11:25.0191889Z 48 LOAD_METHOD 7 (set_runtime_stats_and_log) 2022-11-23T02:11:25.0192027Z 50 CALL_METHOD 0 2022-11-23T02:11:25.0192136Z 52 POP_TOP 2022-11-23T02:11:25.0192156Z 2022-11-23T02:11:25.0192292Z 1063 54 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0192400Z 56 DUP_TOP 2022-11-23T02:11:25.0192537Z 58 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0192671Z 60 LOAD_CONST 2 (1) 2022-11-23T02:11:25.0192787Z 62 INPLACE_ADD 2022-11-23T02:11:25.0192895Z 64 ROT_TWO 2022-11-23T02:11:25.0193047Z 66 STORE_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0193067Z 2022-11-23T02:11:25.0193199Z 1064 68 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0193342Z 70 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0193495Z 72 LOAD_METHOD 10 (prepare_for_forward) 2022-11-23T02:11:25.0193631Z 74 CALL_METHOD 0 2022-11-23T02:11:25.0193741Z 76 POP_TOP 2022-11-23T02:11:25.0193760Z 2022-11-23T02:11:25.0193894Z 1068 >> 78 LOAD_GLOBAL 11 (Join) 2022-11-23T02:11:25.0194052Z 80 LOAD_METHOD 12 
(notify_join_context) 2022-11-23T02:11:25.0194189Z 82 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0194319Z 84 CALL_METHOD 1 2022-11-23T02:11:25.0194454Z 86 STORE_FAST 3 (work) 2022-11-23T02:11:25.0194477Z 2022-11-23T02:11:25.0194592Z 1069 88 LOAD_FAST 3 (work) 2022-11-23T02:11:25.0194739Z 90 POP_JUMP_IF_FALSE 54 (to 108) 2022-11-23T02:11:25.0194758Z 2022-11-23T02:11:25.0194892Z 1070 92 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0195189Z 94 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0195385Z 96 LOAD_METHOD 13 (_set_forward_pass_work_handle) 2022-11-23T02:11:25.0195406Z 2022-11-23T02:11:25.0195537Z 1071 98 LOAD_FAST 3 (work) 2022-11-23T02:11:25.0195702Z 100 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0195873Z 102 LOAD_ATTR 14 (_divide_by_initial_world_size) 2022-11-23T02:11:25.0195893Z 2022-11-23T02:11:25.0196006Z 1070 104 CALL_METHOD 2 2022-11-23T02:11:25.0196118Z 106 POP_TOP 2022-11-23T02:11:25.0196137Z 2022-11-23T02:11:25.0196275Z 1080 >> 108 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0196570Z 110 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0196703Z 112 CALL_METHOD 0 2022-11-23T02:11:25.0196855Z 114 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:25.0196986Z 116 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0197128Z 118 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0197267Z 120 LOAD_METHOD 15 (_rebuild_buckets) 2022-11-23T02:11:25.0197397Z 122 CALL_METHOD 0 2022-11-23T02:11:25.0197542Z 124 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:25.0197561Z 2022-11-23T02:11:25.0197696Z 1081 126 LOAD_GLOBAL 6 (logger) 2022-11-23T02:11:25.0197830Z 128 LOAD_METHOD 16 (info) 2022-11-23T02:11:25.0197849Z 2022-11-23T02:11:25.0198175Z 1082 130 LOAD_CONST 3 ('Reducer buckets have been rebuilt in this iteration.') 2022-11-23T02:11:25.0198200Z 2022-11-23T02:11:25.0198332Z 1081 132 CALL_METHOD 1 2022-11-23T02:11:25.0198440Z 134 POP_TOP 2022-11-23T02:11:25.0198459Z 2022-11-23T02:11:25.0198596Z 1084 136 LOAD_CONST 4 (True) 2022-11-23T02:11:25.0198712Z 138 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0198945Z 140 
STORE_ATTR 17 (_has_rebuilt_buckets) 2022-11-23T02:11:25.0198967Z 2022-11-23T02:11:25.0199121Z 1088 >> 142 LOAD_GLOBAL 18 (hasattr) 2022-11-23T02:11:25.0199253Z 144 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0199479Z 146 LOAD_CONST 5 ('buffer_hook') 2022-11-23T02:11:25.0199614Z 148 CALL_FUNCTION 2 2022-11-23T02:11:25.0199781Z 150 STORE_FAST 4 (buffer_hook_registered) 2022-11-23T02:11:25.0199801Z 2022-11-23T02:11:25.0199933Z 1089 152 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0200090Z 154 LOAD_METHOD 19 (_check_sync_bufs_pre_fwd) 2022-11-23T02:11:25.0200224Z 156 CALL_METHOD 0 2022-11-23T02:11:25.0200373Z 158 POP_JUMP_IF_FALSE 84 (to 168) 2022-11-23T02:11:25.0200392Z 2022-11-23T02:11:25.0200525Z 1090 160 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0200679Z 162 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:25.0200810Z 164 CALL_METHOD 0 2022-11-23T02:11:25.0200919Z 166 POP_TOP 2022-11-23T02:11:25.0200938Z 2022-11-23T02:11:25.0201054Z 1092 >> 168 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0201203Z 170 LOAD_ATTR 21 (_join_config) 2022-11-23T02:11:25.0201338Z 172 LOAD_ATTR 22 (enable) 2022-11-23T02:11:25.0201485Z 174 POP_JUMP_IF_FALSE 94 (to 188) 2022-11-23T02:11:25.0201504Z 2022-11-23T02:11:25.0201636Z 1094 176 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0201833Z 178 LOAD_ATTR 23 (_check_global_requires_backward_grad_sync) 2022-11-23T02:11:25.0201852Z 2022-11-23T02:11:25.0201991Z 1095 180 LOAD_CONST 6 (False) 2022-11-23T02:11:25.0202010Z 2022-11-23T02:11:25.0202248Z 1094 182 LOAD_CONST 7 (('is_joined_rank',)) 2022-11-23T02:11:25.0202389Z 184 CALL_FUNCTION_KW 1 2022-11-23T02:11:25.0202480Z 186 POP_TOP 2022-11-23T02:11:25.0202499Z 2022-11-23T02:11:25.0202630Z 1098 >> 188 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0202783Z 190 LOAD_ATTR 24 (_run_ddp_forward) 2022-11-23T02:11:25.0202917Z 192 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0203047Z 194 BUILD_MAP 0 2022-11-23T02:11:25.0203180Z 196 LOAD_FAST 2 (kwargs) 2022-11-23T02:11:25.0203307Z 198 DICT_MERGE 1 2022-11-23T02:11:25.0203503Z 200 
CALL_FUNCTION_EX 1 2022-11-23T02:11:25.0203641Z 202 STORE_FAST 5 (output) 2022-11-23T02:11:25.0203661Z 2022-11-23T02:11:25.0203793Z 1102 204 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0203963Z 206 LOAD_METHOD 25 (_check_sync_bufs_post_fwd) 2022-11-23T02:11:25.0204095Z 208 CALL_METHOD 0 2022-11-23T02:11:25.0204243Z 210 POP_JUMP_IF_FALSE 110 (to 220) 2022-11-23T02:11:25.0204262Z 2022-11-23T02:11:25.0204396Z 1103 212 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0204548Z 214 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:25.0204659Z 216 CALL_METHOD 0 2022-11-23T02:11:25.0204769Z 218 POP_TOP 2022-11-23T02:11:25.0204789Z 2022-11-23T02:11:25.0204930Z 1105 >> 220 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0205084Z 222 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0205220Z 224 CALL_METHOD 0 2022-11-23T02:11:25.0205367Z 226 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:25.0205499Z 228 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0205671Z 230 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:25.0205798Z 232 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:25.0205887Z 2022-11-23T02:11:25.0206035Z 1106 234 LOAD_CONST 4 (True) 2022-11-23T02:11:25.0206166Z 236 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0206337Z 238 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:25.0206357Z 2022-11-23T02:11:25.0206487Z 1112 240 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0206653Z 242 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:25.0206797Z 244 POP_JUMP_IF_FALSE 137 (to 274) 2022-11-23T02:11:25.0206934Z 246 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0207066Z 248 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0207209Z 250 POP_JUMP_IF_TRUE 137 (to 274) 2022-11-23T02:11:25.0207229Z 2022-11-23T02:11:25.0207359Z 1114 252 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0207498Z 254 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0207668Z 256 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:25.0207688Z 2022-11-23T02:11:25.0207822Z 1115 258 LOAD_GLOBAL 30 (list) 2022-11-23T02:11:25.0207973Z 260 
LOAD_GLOBAL 31 (_find_tensors) 2022-11-23T02:11:25.0208114Z 262 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0208231Z 264 CALL_FUNCTION 1 2022-11-23T02:11:25.0208364Z 266 CALL_FUNCTION 1 2022-11-23T02:11:25.0208383Z 2022-11-23T02:11:25.0208511Z 1114 268 CALL_METHOD 1 2022-11-23T02:11:25.0208625Z 270 POP_TOP 2022-11-23T02:11:25.0208769Z 272 JUMP_FORWARD 10 (to 294) 2022-11-23T02:11:25.0208789Z 2022-11-23T02:11:25.0208923Z 1118 >> 274 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0209062Z 276 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0209230Z 278 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:25.0209343Z 280 BUILD_LIST 0 2022-11-23T02:11:25.0209472Z 282 CALL_METHOD 1 2022-11-23T02:11:25.0209580Z 284 POP_TOP 2022-11-23T02:11:25.0209722Z 286 JUMP_FORWARD 3 (to 294) 2022-11-23T02:11:25.0209741Z 2022-11-23T02:11:25.0209877Z 1120 >> 288 LOAD_CONST 6 (False) 2022-11-23T02:11:25.0210012Z 290 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0210189Z 292 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:25.0210353Z >> 294 POP_BLOCK 2022-11-23T02:11:25.0210390Z 2022-11-23T02:11:25.0210507Z 1057 296 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0210615Z 298 DUP_TOP 2022-11-23T02:11:25.0210719Z 300 DUP_TOP 2022-11-23T02:11:25.0210850Z 302 CALL_FUNCTION 3 2022-11-23T02:11:25.0210957Z 304 POP_TOP 2022-11-23T02:11:25.0211103Z 306 JUMP_FORWARD 8 (to 324) 2022-11-23T02:11:25.0211211Z >> 308 WITH_EXCEPT_START 2022-11-23T02:11:25.0211356Z 310 POP_JUMP_IF_TRUE 157 (to 314) 2022-11-23T02:11:25.0211484Z 312 RERAISE 1 2022-11-23T02:11:25.0211591Z >> 314 POP_TOP 2022-11-23T02:11:25.0211698Z 316 POP_TOP 2022-11-23T02:11:25.0211803Z 318 POP_TOP 2022-11-23T02:11:25.0211917Z 320 POP_EXCEPT 2022-11-23T02:11:25.0212004Z 322 POP_TOP 2022-11-23T02:11:25.0212023Z 2022-11-23T02:11:25.0212161Z 1124 >> 324 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0212330Z 326 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:25.0212475Z 328 POP_JUMP_IF_FALSE 168 (to 336) 2022-11-23T02:11:25.0212607Z 330 LOAD_FAST 0 (self) 
2022-11-23T02:11:25.0212757Z 332 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0212956Z 334 POP_JUMP_IF_FALSE 178 (to 356) 2022-11-23T02:11:25.0212977Z 2022-11-23T02:11:25.0213119Z 1125 >> 336 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0213250Z 338 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0213287Z 2022-11-23T02:11:25.0213402Z 1124 340 EXTENDED_ARG 1 2022-11-23T02:11:25.0213547Z 342 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:25.0213566Z 2022-11-23T02:11:25.0213699Z 1125 344 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0213858Z 346 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0213993Z 348 LOAD_CONST 2 (1) 2022-11-23T02:11:25.0214130Z 350 COMPARE_OP 2 (==) 2022-11-23T02:11:25.0214149Z 2022-11-23T02:11:25.0214283Z 1124 352 EXTENDED_ARG 1 2022-11-23T02:11:25.0214411Z 354 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:25.0214448Z 2022-11-23T02:11:25.0214567Z 1128 >> 356 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0214716Z 358 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0214736Z 2022-11-23T02:11:25.0214867Z 1129 360 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0215016Z 362 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0215036Z 2022-11-23T02:11:25.0215311Z 1127 364 LOAD_CONST 8 (('static_graph', 'num_iterations')) 2022-11-23T02:11:25.0215452Z 366 BUILD_CONST_KEY_MAP 2 2022-11-23T02:11:25.0215603Z 368 STORE_FAST 6 (state_dict) 2022-11-23T02:11:25.0215622Z 2022-11-23T02:11:25.0215790Z 1136 370 LOAD_GLOBAL 32 (_tree_flatten_with_rref) 2022-11-23T02:11:25.0215909Z 372 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0216042Z 374 CALL_FUNCTION 1 2022-11-23T02:11:25.0216061Z 2022-11-23T02:11:25.0216201Z 1132 376 UNPACK_SEQUENCE 3 2022-11-23T02:11:25.0216220Z 2022-11-23T02:11:25.0216380Z 1133 378 STORE_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0216399Z 2022-11-23T02:11:25.0216543Z 1134 380 STORE_FAST 8 (treespec) 2022-11-23T02:11:25.0216563Z 2022-11-23T02:11:25.0216714Z 1135 382 STORE_FAST 9 (output_is_rref) 2022-11-23T02:11:25.0216734Z 2022-11-23T02:11:25.0217224Z 1137 384 
LOAD_CONST 9 ( at 0x7f6052ac4660, file "/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1137>) 2022-11-23T02:11:25.0217633Z 386 LOAD_CONST 10 ('DistributedDataParallel.forward..') 2022-11-23T02:11:25.0217773Z 388 MAKE_FUNCTION 0 2022-11-23T02:11:25.0217893Z 390 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:25.0218031Z 392 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:25.0218193Z 394 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0218326Z 396 CALL_FUNCTION 1 2022-11-23T02:11:25.0218462Z 398 CALL_FUNCTION 1 2022-11-23T02:11:25.0218571Z 400 GET_ITER 2022-11-23T02:11:25.0218703Z 402 CALL_FUNCTION 1 2022-11-23T02:11:25.0218871Z 404 STORE_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0218891Z 2022-11-23T02:11:25.0219017Z 1140 406 LOAD_GLOBAL 35 (enumerate) 2022-11-23T02:11:25.0219174Z 408 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0219313Z 410 CALL_FUNCTION 1 2022-11-23T02:11:25.0219422Z 412 GET_ITER 2022-11-23T02:11:25.0219558Z >> 414 FOR_ITER 18 (to 452) 2022-11-23T02:11:25.0219693Z 416 UNPACK_SEQUENCE 2 2022-11-23T02:11:25.0219827Z 418 STORE_FAST 11 (i) 2022-11-23T02:11:25.0220002Z 420 STORE_FAST 5 (output) 2022-11-23T02:11:25.0220043Z 2022-11-23T02:11:25.0220172Z 1141 422 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0220315Z 424 LOAD_METHOD 36 (is_tensor) 2022-11-23T02:11:25.0220452Z 426 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0220581Z 428 CALL_METHOD 1 2022-11-23T02:11:25.0220729Z 430 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:25.0220864Z 432 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0221006Z 434 LOAD_ATTR 37 (grad_fn) 2022-11-23T02:11:25.0221125Z 436 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0221254Z 438 IS_OP 0 2022-11-23T02:11:25.0221398Z 440 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:25.0221418Z 2022-11-23T02:11:25.0221553Z 1142 442 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0221719Z 444 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0221854Z 446 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0221966Z 448 STORE_SUBSCR 
2022-11-23T02:11:25.0222089Z >> 450 JUMP_ABSOLUTE 207 (to 414) 2022-11-23T02:11:25.0222121Z 2022-11-23T02:11:25.0222247Z 1149 >> 452 LOAD_GLOBAL 38 (_DDPSink) 2022-11-23T02:11:25.0222375Z 454 LOAD_ATTR 39 (apply) 2022-11-23T02:11:25.0222395Z 2022-11-23T02:11:25.0222527Z 1150 456 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0222669Z 458 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0222689Z 2022-11-23T02:11:25.0222832Z 1151 460 LOAD_FAST 6 (state_dict) 2022-11-23T02:11:25.0222852Z 2022-11-23T02:11:25.0222981Z 1149 462 BUILD_LIST 2 2022-11-23T02:11:25.0223000Z 2022-11-23T02:11:25.0223155Z 1152 464 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0223178Z 2022-11-23T02:11:25.0223309Z 1149 466 LIST_EXTEND 1 2022-11-23T02:11:25.0223407Z 468 LIST_TO_TUPLE 2022-11-23T02:11:25.0223540Z 470 CALL_FUNCTION_EX 0 2022-11-23T02:11:25.0223710Z 472 STORE_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:25.0223730Z 2022-11-23T02:11:25.0223860Z 1154 474 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:25.0223993Z 476 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:25.0224153Z 478 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0224349Z 480 CALL_FUNCTION 1 2022-11-23T02:11:25.0224479Z 482 CALL_FUNCTION 1 2022-11-23T02:11:25.0224569Z 484 GET_ITER 2022-11-23T02:11:25.0224701Z >> 486 FOR_ITER 15 (to 518) 2022-11-23T02:11:25.0224836Z 488 STORE_FAST 11 (i) 2022-11-23T02:11:25.0224855Z 2022-11-23T02:11:25.0225020Z 1155 490 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0225147Z 492 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0225261Z 494 BINARY_SUBSCR 2022-11-23T02:11:25.0225394Z 496 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0225518Z 498 IS_OP 0 2022-11-23T02:11:25.0225633Z 500 EXTENDED_ARG 1 2022-11-23T02:11:25.0225774Z 502 POP_JUMP_IF_FALSE 258 (to 516) 2022-11-23T02:11:25.0225793Z 2022-11-23T02:11:25.0225963Z 1156 504 LOAD_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:25.0226102Z 506 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0226216Z 508 BINARY_SUBSCR 2022-11-23T02:11:25.0226377Z 510 LOAD_FAST 10 
(output_placeholders) 2022-11-23T02:11:25.0226505Z 512 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0226601Z 514 STORE_SUBSCR 2022-11-23T02:11:25.0226796Z >> 516 JUMP_ABSOLUTE 243 (to 486) 2022-11-23T02:11:25.0226818Z 2022-11-23T02:11:25.0226994Z 1159 >> 518 LOAD_GLOBAL 40 (_tree_unflatten_with_rref) 2022-11-23T02:11:25.0227013Z 2022-11-23T02:11:25.0227171Z 1160 520 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0227306Z 522 LOAD_FAST 8 (treespec) 2022-11-23T02:11:25.0227457Z 524 LOAD_FAST 9 (output_is_rref) 2022-11-23T02:11:25.0227477Z 2022-11-23T02:11:25.0227605Z 1159 526 CALL_FUNCTION 3 2022-11-23T02:11:25.0227744Z 528 STORE_FAST 5 (output) 2022-11-23T02:11:25.0227764Z 2022-11-23T02:11:25.0227900Z 1162 >> 530 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0227996Z 532 RETURN_VALUE 2022-11-23T02:11:25.0228089Z 2022-11-23T02:11:25.0228108Z 2022-11-23T02:11:25.0228580Z [2022-11-23 02:11:13,738] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:25.0228880Z [2022-11-23 02:11:13,739] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [] 2022-11-23T02:11:25.0229375Z [2022-11-23 02:11:13,739] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR autograd [TorchVariable()] 2022-11-23T02:11:25.0229908Z [2022-11-23 02:11:13,739] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR profiler [TorchVariable()] 2022-11-23T02:11:25.0230485Z [2022-11-23 02:11:13,739] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR record_function [TorchVariable()] 2022-11-23T02:11:25.0230933Z [2022-11-23 02:11:13,740] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1058 2022-11-23T02:11:25.0231456Z [2022-11-23 02:11:13,740] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST DistributedDataParallel.forward [TorchVariable()] 2022-11-23T02:11:25.0231902Z [2022-11-23 02:11:13,740] 
torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:25.0232395Z [2022-11-23 02:11:13,740] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(), ConstantVariable(str)] 2022-11-23T02:11:25.0232759Z [2022-11-23 02:11:13,740] torch._dynamo.variables.torch: [WARNING] Profiler will be ignored 2022-11-23T02:11:25.0233103Z [2022-11-23 02:11:13,740] torch._dynamo.symbolic_convert: [DEBUG] TRACE SETUP_WITH 308 [NullContextVariable()] 2022-11-23T02:11:25.0233520Z [2022-11-23 02:11:13,740] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_TOP None [WithExitFunctionVariable(), ConstantVariable(NoneType)] 2022-11-23T02:11:25.0233969Z [2022-11-23 02:11:13,740] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1060 2022-11-23T02:11:25.0234333Z [2022-11-23 02:11:13,740] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [WithExitFunctionVariable()] 2022-11-23T02:11:25.0234910Z [2022-11-23 02:11:13,741] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_grad_enabled [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:25.0235602Z [2022-11-23 02:11:13,741] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:25.0236153Z [2022-11-23 02:11:13,741] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0236579Z [2022-11-23 02:11:13,741] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0237169Z [2022-11-23 02:11:13,741] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR require_backward_grad_sync [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0237647Z [2022-11-23 02:11:13,742] 
torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0238089Z [2022-11-23 02:11:13,742] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1061 2022-11-23T02:11:25.0238490Z [2022-11-23 02:11:13,742] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0239036Z [2022-11-23 02:11:13,742] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0239572Z [2022-11-23 02:11:13,743] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:25.0240153Z [2022-11-23 02:11:13,743] torch._dynamo.symbolic_convert: [DEBUG] TRACE IS_OP 1 [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger), ConstantVariable(NoneType)] 2022-11-23T02:11:25.0240734Z [2022-11-23 02:11:13,743] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 44 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0241303Z [2022-11-23 02:11:13,743] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1062 2022-11-23T02:11:25.0241708Z [2022-11-23 02:11:13,743] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0242203Z [2022-11-23 02:11:13,743] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0242724Z [2022-11-23 02:11:13,743] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR set_runtime_stats_and_log [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:25.0243223Z [2022-11-23 
02:11:13,744] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), UserDefinedObjectVariable(instancemethod)] 2022-11-23T02:11:25.0243816Z [2022-11-23 02:11:13,745] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 496 2022-11-23T02:11:25.0243994Z 498 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:25.0264042Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:25.0264187Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0264305Z 6 RETURN_VALUE 2022-11-23T02:11:25.0264328Z 2022-11-23T02:11:25.0264422Z 2022-11-23T02:11:25.0264961Z [2022-11-23 02:11:13,745] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 496 2022-11-23T02:11:25.0265109Z 496 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:25.0265253Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:25.0265274Z 2022-11-23T02:11:25.0265393Z 498 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0265506Z 6 RETURN_VALUE 2022-11-23T02:11:25.0265533Z 2022-11-23T02:11:25.0265618Z 2022-11-23T02:11:25.0265891Z [2022-11-23 02:11:13,746] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0266001Z - 2022-11-23T02:11:25.0266185Z local 'ddp_m' TYPE_MATCH 2022-11-23T02:11:25.0266285Z { 2022-11-23T02:11:25.0266478Z 'guard_types': ['TYPE_MATCH'], 2022-11-23T02:11:25.0266842Z 'code': ['___check_type_id(ddp_m, 94883284555664)'], 2022-11-23T02:11:25.0267227Z 'obj_weakref': 2022-11-23T02:11:25.0267614Z 'guarded_class': 2022-11-23T02:11:25.0267717Z } 2022-11-23T02:11:25.0267807Z 2022-11-23T02:11:25.0267913Z - 2022-11-23T02:11:25.0268104Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:25.0268187Z { 2022-11-23T02:11:25.0268409Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0268569Z 'code': None, 2022-11-23T02:11:25.0268863Z 'obj_weakref': 2022-11-23T02:11:25.0269219Z 'guarded_class': 2022-11-23T02:11:25.0269320Z } 2022-11-23T02:11:25.0269414Z 2022-11-23T02:11:25.0269755Z 
[2022-11-23 02:11:13,747] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward 2022-11-23T02:11:25.0270223Z [2022-11-23 02:11:13,747] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:92 2022-11-23T02:11:25.0270535Z [2022-11-23 02:11:13,747] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:25.0270900Z [2022-11-23 02:11:13,747] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq [NNModuleVariable()] 2022-11-23T02:11:25.0271250Z [2022-11-23 02:11:13,748] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [NNModuleVariable()] 2022-11-23T02:11:25.0271653Z [2022-11-23 02:11:13,748] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:25.0272216Z [2022-11-23 02:11:13,748] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:25.0272359Z 78 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0272493Z 2 LOAD_METHOD 0 (linear) 2022-11-23T02:11:25.0272614Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0272747Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0272861Z 8 RETURN_VALUE 2022-11-23T02:11:25.0272948Z 2022-11-23T02:11:25.0272969Z 2022-11-23T02:11:25.0273437Z [2022-11-23 02:11:13,748] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:78 2022-11-23T02:11:25.0273864Z [2022-11-23 02:11:13,749] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:25.0274216Z [2022-11-23 02:11:13,749] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR linear [NNModuleVariable()] 2022-11-23T02:11:25.0274571Z [2022-11-23 02:11:13,749] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [NNModuleVariable()] 2022-11-23T02:11:25.0274974Z [2022-11-23 02:11:13,749] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 
2022-11-23T02:11:25.0275577Z [2022-11-23 02:11:13,756] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0276114Z [2022-11-23 02:11:13,756] torch._dynamo.symbolic_convert: [DEBUG] DONE INLINING 2022-11-23T02:11:25.0276642Z [2022-11-23 02:11:13,758] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:25.0276776Z 70 0 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0276914Z 2 LOAD_METHOD 1 (mm) 2022-11-23T02:11:25.0277148Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0277289Z 6 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0277427Z 8 LOAD_ATTR 2 (weight) 2022-11-23T02:11:25.0277544Z 10 LOAD_METHOD 3 (t) 2022-11-23T02:11:25.0277673Z 12 CALL_METHOD 0 2022-11-23T02:11:25.0277802Z 14 CALL_METHOD 2 2022-11-23T02:11:25.0277908Z 16 RETURN_VALUE 2022-11-23T02:11:25.0277998Z 2022-11-23T02:11:25.0278019Z 2022-11-23T02:11:25.0278469Z [2022-11-23 02:11:13,759] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:70 2022-11-23T02:11:25.0278775Z [2022-11-23 02:11:13,759] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [] 2022-11-23T02:11:25.0279263Z [2022-11-23 02:11:13,759] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR mm [TorchVariable()] 2022-11-23T02:11:25.0279680Z [2022-11-23 02:11:13,759] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [TorchVariable()] 2022-11-23T02:11:25.0280126Z [2022-11-23 02:11:13,759] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TorchVariable(), TensorVariable()] 2022-11-23T02:11:25.0280631Z [2022-11-23 02:11:13,759] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR weight [TorchVariable(), TensorVariable(), NNModuleVariable()] 2022-11-23T02:11:25.0281129Z [2022-11-23 02:11:13,761] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR t [TorchVariable(), TensorVariable(), TensorVariable()] 2022-11-23T02:11:25.0281685Z [2022-11-23 02:11:13,762] 
torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [TorchVariable(), TensorVariable(), GetAttrVariable(TensorVariable(), t)] 2022-11-23T02:11:25.0282194Z [2022-11-23 02:11:13,763] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(), TensorVariable(), TensorVariable()] 2022-11-23T02:11:25.0282526Z [2022-11-23 02:11:13,764] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0283040Z [2022-11-23 02:11:13,764] torch._dynamo.symbolic_convert: [DEBUG] DONE INLINING 2022-11-23T02:11:25.0283644Z [2022-11-23 02:11:13,766] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:25.0283790Z 78 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0283930Z 2 LOAD_METHOD 0 (linear) 2022-11-23T02:11:25.0284053Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0284166Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0284279Z 8 RETURN_VALUE 2022-11-23T02:11:25.0284374Z 2022-11-23T02:11:25.0284395Z 2022-11-23T02:11:25.0284831Z [2022-11-23 02:11:13,766] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:78 2022-11-23T02:11:25.0285124Z [2022-11-23 02:11:13,767] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:25.0285460Z [2022-11-23 02:11:13,767] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR linear [NNModuleVariable()] 2022-11-23T02:11:25.0285789Z [2022-11-23 02:11:13,767] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x [NNModuleVariable()] 2022-11-23T02:11:25.0286226Z [2022-11-23 02:11:13,767] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:25.0286578Z [2022-11-23 02:11:13,774] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0287085Z [2022-11-23 02:11:13,774] torch._dynamo.symbolic_convert: [DEBUG] DONE INLINING 2022-11-23T02:11:25.0287422Z [2022-11-23 02:11:13,776] 
torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0287744Z [2022-11-23 02:11:13,776] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing forward 2022-11-23T02:11:25.0288076Z [2022-11-23 02:11:13,778] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function compile_fn 2022-11-23T02:11:25.0288513Z [2022-11-23 02:11:13,778] torch._dynamo.optimizations.distributed: [INFO] DDPOptimizer used bucket cap 26214400 and produced the following buckets: 2022-11-23T02:11:25.0288945Z [2022-11-23 02:11:13,778] torch._dynamo.optimizations.distributed: [INFO] Please `pip install tabulate` in order to pretty-print ddp bucket sizes 2022-11-23T02:11:25.0289220Z [2022-11-23 02:11:13,780] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0289367Z ---orig graph--- 2022-11-23T02:11:25.0289468Z graph(): 2022-11-23T02:11:25.0289625Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0289843Z %self_seq_0_linear : [#users=1] = call_module[target=self_seq_0_linear](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0290071Z %self_seq_1 : [#users=1] = call_module[target=self_seq_1](args = (%self_seq_0_linear,), kwargs = {}) 2022-11-23T02:11:25.0290298Z %self_seq_2_weight : [#users=1] = get_attr[target=self_seq_2_weight] 2022-11-23T02:11:25.0290501Z %t : [#users=1] = call_method[target=t](args = (%self_seq_2_weight,), kwargs = {}) 2022-11-23T02:11:25.0290708Z %mm : [#users=1] = call_function[target=torch.mm](args = (%self_seq_1, %t), kwargs = {}) 2022-11-23T02:11:25.0290899Z %self_seq_3 : [#users=1] = call_module[target=self_seq_3](args = (%mm,), kwargs = {}) 2022-11-23T02:11:25.0291130Z %self_seq_4_linear : [#users=1] = call_module[target=self_seq_4_linear](args = (%self_seq_3,), kwargs = {}) 2022-11-23T02:11:25.0291337Z %self_seq_5 : [#users=1] = call_module[target=self_seq_5](args = (%self_seq_4_linear,), kwargs = {}) 2022-11-23T02:11:25.0291455Z return (self_seq_5,) 
2022-11-23T02:11:25.0291476Z 2022-11-23T02:11:25.0291631Z ---split graph--- 2022-11-23T02:11:25.0291805Z graph(): 2022-11-23T02:11:25.0291969Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0292153Z %self_seq_2_weight : [#users=1] = get_attr[target=self_seq_2_weight] 2022-11-23T02:11:25.0292374Z %submod_0 : [#users=1] = call_module[target=submod_0](args = (%x, %self_seq_2_weight), kwargs = {}) 2022-11-23T02:11:25.0292584Z %submod_1 : [#users=1] = call_module[target=submod_1](args = (%submod_0,), kwargs = {}) 2022-11-23T02:11:25.0292682Z return (submod_1,) 2022-11-23T02:11:25.0292702Z 2022-11-23T02:11:25.0292861Z ---submod_0 graph--- 2022-11-23T02:11:25.0292952Z graph(): 2022-11-23T02:11:25.0293117Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0293311Z %self_seq_2_weight : [#users=1] = placeholder[target=self_seq_2_weight] 2022-11-23T02:11:25.0293533Z %self_seq_0_linear : [#users=1] = call_module[target=self_seq_0_linear](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0293741Z %self_seq_1 : [#users=1] = call_module[target=self_seq_1](args = (%self_seq_0_linear,), kwargs = {}) 2022-11-23T02:11:25.0293945Z %t : [#users=1] = call_method[target=t](args = (%self_seq_2_weight,), kwargs = {}) 2022-11-23T02:11:25.0294133Z %mm : [#users=1] = call_function[target=torch.mm](args = (%self_seq_1, %t), kwargs = {}) 2022-11-23T02:11:25.0294333Z %self_seq_3 : [#users=1] = call_module[target=self_seq_3](args = (%mm,), kwargs = {}) 2022-11-23T02:11:25.0294500Z return self_seq_3 2022-11-23T02:11:25.0294522Z 2022-11-23T02:11:25.0294685Z ---submod_1 graph--- 2022-11-23T02:11:25.0294775Z graph(): 2022-11-23T02:11:25.0294946Z %self_seq_3 : [#users=1] = placeholder[target=self_seq_3] 2022-11-23T02:11:25.0295180Z %self_seq_4_linear : [#users=1] = call_module[target=self_seq_4_linear](args = (%self_seq_3,), kwargs = {}) 2022-11-23T02:11:25.0295401Z %self_seq_5 : [#users=1] = call_module[target=self_seq_5](args = (%self_seq_4_linear,), 
kwargs = {}) 2022-11-23T02:11:25.0295497Z return self_seq_5 2022-11-23T02:11:25.0295521Z 2022-11-23T02:11:25.0295660Z --------------- 2022-11-23T02:11:25.0295680Z 2022-11-23T02:11:25.0296046Z [2022-11-23 02:11:13,781] torch._dynamo.optimizations.distributed: [DEBUG] run_node placeholder, x got args tuple() 2022-11-23T02:11:25.0296437Z [2022-11-23 02:11:13,781] torch._dynamo.optimizations.distributed: [DEBUG] run_node get_attr, self_seq_2_weight got args tuple() 2022-11-23T02:11:25.0296914Z [2022-11-23 02:11:13,781] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_0 got args tuple(T[torch.Size([512, 512])], T[torch.Size([512, 512])]) 2022-11-23T02:11:25.0297194Z [2022-11-23 02:11:13,781] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0297346Z ---submod_0 graph--- 2022-11-23T02:11:25.0297445Z graph(): 2022-11-23T02:11:25.0297598Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0297787Z %self_seq_2_weight : [#users=1] = placeholder[target=self_seq_2_weight] 2022-11-23T02:11:25.0297995Z %self_seq_0_linear : [#users=1] = call_module[target=self_seq_0_linear](args = (%x,), kwargs = {}) 2022-11-23T02:11:25.0298215Z %self_seq_1 : [#users=1] = call_module[target=self_seq_1](args = (%self_seq_0_linear,), kwargs = {}) 2022-11-23T02:11:25.0298414Z %t : [#users=1] = call_method[target=t](args = (%self_seq_2_weight,), kwargs = {}) 2022-11-23T02:11:25.0298625Z %mm : [#users=1] = call_function[target=torch.mm](args = (%self_seq_1, %t), kwargs = {}) 2022-11-23T02:11:25.0298825Z %self_seq_3 : [#users=1] = call_module[target=self_seq_3](args = (%mm,), kwargs = {}) 2022-11-23T02:11:25.0298928Z return self_seq_3 2022-11-23T02:11:25.0299346Z [2022-11-23 02:11:13,782] torch._dynamo.optimizations.distributed: [DEBUG] run_node call_module, submod_1 got args tuple(T[torch.Size([512, 512])]) 2022-11-23T02:11:25.0299630Z [2022-11-23 02:11:13,782] torch._dynamo.optimizations.distributed: [DEBUG] 
2022-11-23T02:11:25.0299779Z ---submod_1 graph--- 2022-11-23T02:11:25.0299878Z graph(): 2022-11-23T02:11:25.0300114Z %self_seq_3 : [#users=1] = placeholder[target=self_seq_3] 2022-11-23T02:11:25.0300345Z %self_seq_4_linear : [#users=1] = call_module[target=self_seq_4_linear](args = (%self_seq_3,), kwargs = {}) 2022-11-23T02:11:25.0300566Z %self_seq_5 : [#users=1] = call_module[target=self_seq_5](args = (%self_seq_4_linear,), kwargs = {}) 2022-11-23T02:11:25.0300677Z return self_seq_5 2022-11-23T02:11:25.0301092Z [2022-11-23 02:11:13,783] torch._dynamo.optimizations.distributed: [DEBUG] run_node output, output got args tuple(tuple(T[torch.Size([512, 512])])) 2022-11-23T02:11:25.0301373Z [2022-11-23 02:11:13,783] torch._dynamo.optimizations.distributed: [DEBUG] 2022-11-23T02:11:25.0301516Z ---final graph--- 2022-11-23T02:11:25.0301618Z graph(): 2022-11-23T02:11:25.0301788Z %x : torch.Tensor [#users=1] = placeholder[target=x] 2022-11-23T02:11:25.0301961Z %self_seq_2_weight : [#users=1] = get_attr[target=self_seq_2_weight] 2022-11-23T02:11:25.0302195Z %submod_0 : [#users=1] = call_module[target=compiled_submod_0](args = (%x, %self_seq_2_weight), kwargs = {}) 2022-11-23T02:11:25.0302418Z %submod_1 : [#users=1] = call_module[target=compiled_submod_1](args = (%submod_0,), kwargs = {}) 2022-11-23T02:11:25.0302515Z return (submod_1,) 2022-11-23T02:11:25.0302656Z --------------- 2022-11-23T02:11:25.0302678Z 2022-11-23T02:11:25.0303053Z [2022-11-23 02:11:13,783] torch._dynamo.output_graph: [INFO] Step 2: done compiler function compile_fn 2022-11-23T02:11:25.0303329Z [2022-11-23 02:11:13,784] torch._dynamo.output_graph: [CODE] TRACED GRAPH 2022-11-23T02:11:25.0303520Z __compiled_fn_4 .53 opcode, name, target, args, kwargs 2022-11-23T02:11:25.0303644Z placeholder, x, x, (), {} 2022-11-23T02:11:25.0303819Z call_module, self_seq_0_linear, self_seq_0_linear, (x,), {} 2022-11-23T02:11:25.0303992Z call_module, self_seq_1, self_seq_1, (self_seq_0_linear,), {} 
2022-11-23T02:11:25.0304138Z get_attr, self_seq_2_weight, self_seq_2_weight, (), {} 2022-11-23T02:11:25.0304275Z call_method, t, t, (self_seq_2_weight,), {} 2022-11-23T02:11:25.0304594Z call_function, mm, , (self_seq_1, t), {} 2022-11-23T02:11:25.0304745Z call_module, self_seq_3, self_seq_3, (mm,), {} 2022-11-23T02:11:25.0304926Z call_module, self_seq_4_linear, self_seq_4_linear, (self_seq_3,), {} 2022-11-23T02:11:25.0305087Z call_module, self_seq_5, self_seq_5, (self_seq_4_linear,), {} 2022-11-23T02:11:25.0305238Z output, output, output, ((self_seq_5,),), {} 2022-11-23T02:11:25.0305260Z 2022-11-23T02:11:25.0305725Z [2022-11-23 02:11:13,784] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 91 2022-11-23T02:11:25.0305847Z 92 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0305984Z 2 LOAD_METHOD 0 (seq) 2022-11-23T02:11:25.0306112Z 4 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0306230Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0306346Z 8 RETURN_VALUE 2022-11-23T02:11:25.0306365Z 2022-11-23T02:11:25.0306459Z 2022-11-23T02:11:25.0306913Z [2022-11-23 02:11:13,784] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 91 2022-11-23T02:11:25.0307070Z 91 0 LOAD_GLOBAL 1 (__compiled_fn_4) 2022-11-23T02:11:25.0307189Z 2 LOAD_FAST 1 (x) 2022-11-23T02:11:25.0307309Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0307438Z 6 UNPACK_SEQUENCE 1 2022-11-23T02:11:25.0307551Z 8 RETURN_VALUE 2022-11-23T02:11:25.0307571Z 2022-11-23T02:11:25.0307662Z 2022-11-23T02:11:25.0307910Z [2022-11-23 02:11:13,785] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0308013Z - 2022-11-23T02:11:25.0308166Z local 'x' TENSOR_MATCH 2022-11-23T02:11:25.0308266Z { 2022-11-23T02:11:25.0308468Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0308696Z 'code': None, 2022-11-23T02:11:25.0308980Z 'obj_weakref': 2022-11-23T02:11:25.0309319Z 
'guarded_class': 2022-11-23T02:11:25.0309416Z } 2022-11-23T02:11:25.0309496Z 2022-11-23T02:11:25.0309605Z - 2022-11-23T02:11:25.0309765Z local 'self' NN_MODULE 2022-11-23T02:11:25.0309858Z { 2022-11-23T02:11:25.0310050Z 'guard_types': ['ID_MATCH'], 2022-11-23T02:11:25.0310281Z 'code': ['___check_obj_id(self, 140050948882192)'], 2022-11-23T02:11:25.0310574Z 'obj_weakref': 2022-11-23T02:11:25.0310878Z 'guarded_class': 2022-11-23T02:11:25.0310971Z } 2022-11-23T02:11:25.0311066Z 2022-11-23T02:11:25.0311179Z - 2022-11-23T02:11:25.0311368Z global 'torch' FUNCTION_MATCH 2022-11-23T02:11:25.0311465Z { 2022-11-23T02:11:25.0311625Z 'guard_types': None, 2022-11-23T02:11:25.0311762Z 'code': None, 2022-11-23T02:11:25.0311929Z 'obj_weakref': None 2022-11-23T02:11:25.0312145Z 'guarded_class': None 2022-11-23T02:11:25.0312247Z } 2022-11-23T02:11:25.0312345Z 2022-11-23T02:11:25.0312441Z - 2022-11-23T02:11:25.0312630Z local_nn_module 'self.seq' NN_MODULE 2022-11-23T02:11:25.0312727Z { 2022-11-23T02:11:25.0312895Z 'guard_types': None, 2022-11-23T02:11:25.0313047Z 'code': None, 2022-11-23T02:11:25.0313201Z 'obj_weakref': None 2022-11-23T02:11:25.0313369Z 'guarded_class': None 2022-11-23T02:11:25.0313464Z } 2022-11-23T02:11:25.0313548Z 2022-11-23T02:11:25.0313644Z - 2022-11-23T02:11:25.0313857Z local_nn_module 'self.seq[0]' NN_MODULE 2022-11-23T02:11:25.0313953Z { 2022-11-23T02:11:25.0314111Z 'guard_types': None, 2022-11-23T02:11:25.0314260Z 'code': None, 2022-11-23T02:11:25.0314424Z 'obj_weakref': None 2022-11-23T02:11:25.0314576Z 'guarded_class': None 2022-11-23T02:11:25.0314672Z } 2022-11-23T02:11:25.0314759Z 2022-11-23T02:11:25.0314863Z - 2022-11-23T02:11:25.0315309Z local_nn_module 'self.seq[1]' NN_MODULE 2022-11-23T02:11:25.0315416Z { 2022-11-23T02:11:25.0315582Z 'guard_types': None, 2022-11-23T02:11:25.0315718Z 'code': None, 2022-11-23T02:11:25.0315882Z 'obj_weakref': None 2022-11-23T02:11:25.0316051Z 'guarded_class': None 2022-11-23T02:11:25.0316147Z } 
2022-11-23T02:11:25.0316235Z 2022-11-23T02:11:25.0316345Z - 2022-11-23T02:11:25.0316537Z local_nn_module 'self.seq[2]' NN_MODULE 2022-11-23T02:11:25.0316633Z { 2022-11-23T02:11:25.0316800Z 'guard_types': None, 2022-11-23T02:11:25.0316952Z 'code': None, 2022-11-23T02:11:25.0317106Z 'obj_weakref': None 2022-11-23T02:11:25.0317272Z 'guarded_class': None 2022-11-23T02:11:25.0317371Z } 2022-11-23T02:11:25.0317450Z 2022-11-23T02:11:25.0317556Z - 2022-11-23T02:11:25.0317761Z local_nn_module 'self.seq[3]' NN_MODULE 2022-11-23T02:11:25.0317848Z { 2022-11-23T02:11:25.0318014Z 'guard_types': None, 2022-11-23T02:11:25.0318165Z 'code': None, 2022-11-23T02:11:25.0318329Z 'obj_weakref': None 2022-11-23T02:11:25.0318481Z 'guarded_class': None 2022-11-23T02:11:25.0318567Z } 2022-11-23T02:11:25.0318659Z 2022-11-23T02:11:25.0318764Z - 2022-11-23T02:11:25.0319064Z local_nn_module 'self.seq[4]' NN_MODULE 2022-11-23T02:11:25.0319152Z { 2022-11-23T02:11:25.0319302Z 'guard_types': None, 2022-11-23T02:11:25.0319451Z 'code': None, 2022-11-23T02:11:25.0319617Z 'obj_weakref': None 2022-11-23T02:11:25.0319785Z 'guarded_class': None 2022-11-23T02:11:25.0319880Z } 2022-11-23T02:11:25.0319971Z 2022-11-23T02:11:25.0320074Z - 2022-11-23T02:11:25.0320261Z local_nn_module 'self.seq[5]' NN_MODULE 2022-11-23T02:11:25.0320360Z { 2022-11-23T02:11:25.0320530Z 'guard_types': None, 2022-11-23T02:11:25.0320671Z 'code': None, 2022-11-23T02:11:25.0320832Z 'obj_weakref': None 2022-11-23T02:11:25.0321001Z 'guarded_class': None 2022-11-23T02:11:25.0321095Z } 2022-11-23T02:11:25.0321173Z 2022-11-23T02:11:25.0321275Z - 2022-11-23T02:11:25.0321498Z local_nn_module 'self.seq[0].linear' NN_MODULE 2022-11-23T02:11:25.0321594Z { 2022-11-23T02:11:25.0321761Z 'guard_types': None, 2022-11-23T02:11:25.0321912Z 'code': None, 2022-11-23T02:11:25.0322060Z 'obj_weakref': None 2022-11-23T02:11:25.0322219Z 'guarded_class': None 2022-11-23T02:11:25.0322312Z } 2022-11-23T02:11:25.0322480Z 2022-11-23T02:11:25.0322597Z - 
2022-11-23T02:11:25.0322825Z local_nn_module 'self.seq[2].weight' TENSOR_MATCH 2022-11-23T02:11:25.0322921Z { 2022-11-23T02:11:25.0323072Z 'guard_types': None, 2022-11-23T02:11:25.0323224Z 'code': None, 2022-11-23T02:11:25.0323389Z 'obj_weakref': None 2022-11-23T02:11:25.0323557Z 'guarded_class': None 2022-11-23T02:11:25.0323643Z } 2022-11-23T02:11:25.0323738Z 2022-11-23T02:11:25.0323832Z - 2022-11-23T02:11:25.0324042Z local_nn_module 'self.seq[4].linear' NN_MODULE 2022-11-23T02:11:25.0324143Z { 2022-11-23T02:11:25.0324308Z 'guard_types': None, 2022-11-23T02:11:25.0324459Z 'code': None, 2022-11-23T02:11:25.0324614Z 'obj_weakref': None 2022-11-23T02:11:25.0324781Z 'guarded_class': None 2022-11-23T02:11:25.0324878Z } 2022-11-23T02:11:25.0324961Z 2022-11-23T02:11:25.0325121Z frames [('total', 2), ('ok', 2)] 2022-11-23T02:11:25.0325428Z inline_call [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0325545Z unimplemented [] 2022-11-23T02:11:25.0325847Z graph_break [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0326121Z stats [('calls_captured', 7), ('fusions_possible', 6), ('unique_graphs', 1)] 2022-11-23T02:11:25.0326217Z ok (0.066s) 2022-11-23T02:11:25.0326351Z test_no_split (__main__.TestDistributed) 2022-11-23T02:11:25.0326835Z Ensures the DDPOptimizer returns a correct, compiled module without ... 
[2022-11-23 02:11:13,792] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing opt_fn 2022-11-23T02:11:25.0327287Z [2022-11-23 02:11:13,792] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:417 2022-11-23T02:11:25.0327596Z [2022-11-23 02:11:13,792] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF ddp_m [] 2022-11-23T02:11:25.0328040Z [2022-11-23 02:11:13,792] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0328515Z [2022-11-23 02:11:13,792] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UnspecializedNNModuleVariable(DistributedDataParallel), TensorVariable()] 2022-11-23T02:11:25.0329050Z [2022-11-23 02:11:13,794] torch._dynamo.symbolic_convert: [DEBUG] INLINING 2022-11-23T02:11:25.0329254Z 1057 0 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0329394Z 2 LOAD_ATTR 1 (autograd) 2022-11-23T02:11:25.0329522Z 4 LOAD_ATTR 2 (profiler) 2022-11-23T02:11:25.0329678Z 6 LOAD_METHOD 3 (record_function) 2022-11-23T02:11:25.0329699Z 2022-11-23T02:11:25.0329988Z 1058 8 LOAD_CONST 1 ('DistributedDataParallel.forward') 2022-11-23T02:11:25.0330008Z 2022-11-23T02:11:25.0330138Z 1057 10 CALL_METHOD 1 2022-11-23T02:11:25.0330279Z 12 SETUP_WITH 147 (to 308) 2022-11-23T02:11:25.0330388Z 14 POP_TOP 2022-11-23T02:11:25.0330408Z 2022-11-23T02:11:25.0330542Z 1060 16 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0330696Z 18 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0330820Z 20 CALL_METHOD 0 2022-11-23T02:11:25.0330953Z 22 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:25.0331085Z 24 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0331256Z 26 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:25.0331401Z 28 POP_JUMP_IF_FALSE 39 (to 78) 2022-11-23T02:11:25.0331421Z 2022-11-23T02:11:25.0331608Z 1061 30 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0331755Z 32 LOAD_ATTR 6 (logger) 
2022-11-23T02:11:25.0331885Z 34 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0331993Z 36 IS_OP 1 2022-11-23T02:11:25.0332138Z 38 POP_JUMP_IF_TRUE 22 (to 44) 2022-11-23T02:11:25.0332266Z 40 LOAD_ASSERTION_ERROR 2022-11-23T02:11:25.0332400Z 42 RAISE_VARARGS 1 2022-11-23T02:11:25.0332420Z 2022-11-23T02:11:25.0332552Z 1062 >> 44 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0332690Z 46 LOAD_ATTR 6 (logger) 2022-11-23T02:11:25.0332858Z 48 LOAD_METHOD 7 (set_runtime_stats_and_log) 2022-11-23T02:11:25.0332988Z 50 CALL_METHOD 0 2022-11-23T02:11:25.0333079Z 52 POP_TOP 2022-11-23T02:11:25.0333099Z 2022-11-23T02:11:25.0333235Z 1063 54 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0333341Z 56 DUP_TOP 2022-11-23T02:11:25.0333489Z 58 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0333624Z 60 LOAD_CONST 2 (1) 2022-11-23T02:11:25.0333740Z 62 INPLACE_ADD 2022-11-23T02:11:25.0333844Z 64 ROT_TWO 2022-11-23T02:11:25.0333975Z 66 STORE_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0333994Z 2022-11-23T02:11:25.0334127Z 1064 68 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0334265Z 70 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0334432Z 72 LOAD_METHOD 10 (prepare_for_forward) 2022-11-23T02:11:25.0334564Z 74 CALL_METHOD 0 2022-11-23T02:11:25.0334672Z 76 POP_TOP 2022-11-23T02:11:25.0334691Z 2022-11-23T02:11:25.0334826Z 1068 >> 78 LOAD_GLOBAL 11 (Join) 2022-11-23T02:11:25.0334985Z 80 LOAD_METHOD 12 (notify_join_context) 2022-11-23T02:11:25.0335103Z 82 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0335229Z 84 CALL_METHOD 1 2022-11-23T02:11:25.0335359Z 86 STORE_FAST 3 (work) 2022-11-23T02:11:25.0335378Z 2022-11-23T02:11:25.0335506Z 1069 88 LOAD_FAST 3 (work) 2022-11-23T02:11:25.0335646Z 90 POP_JUMP_IF_FALSE 54 (to 108) 2022-11-23T02:11:25.0335665Z 2022-11-23T02:11:25.0335789Z 1070 92 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0336010Z 94 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0336187Z 96 LOAD_METHOD 13 (_set_forward_pass_work_handle) 2022-11-23T02:11:25.0336207Z 2022-11-23T02:11:25.0336320Z 1071 98 LOAD_FAST 3 (work) 
2022-11-23T02:11:25.0336451Z 100 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0336624Z 102 LOAD_ATTR 14 (_divide_by_initial_world_size) 2022-11-23T02:11:25.0336643Z 2022-11-23T02:11:25.0336776Z 1070 104 CALL_METHOD 2 2022-11-23T02:11:25.0336885Z 106 POP_TOP 2022-11-23T02:11:25.0336904Z 2022-11-23T02:11:25.0337039Z 1080 >> 108 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0337191Z 110 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0337314Z 112 CALL_METHOD 0 2022-11-23T02:11:25.0337442Z 114 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:25.0337571Z 116 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0337710Z 118 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0337863Z 120 LOAD_METHOD 15 (_rebuild_buckets) 2022-11-23T02:11:25.0337990Z 122 CALL_METHOD 0 2022-11-23T02:11:25.0338133Z 124 POP_JUMP_IF_FALSE 71 (to 142) 2022-11-23T02:11:25.0338152Z 2022-11-23T02:11:25.0338339Z 1081 126 LOAD_GLOBAL 6 (logger) 2022-11-23T02:11:25.0338483Z 128 LOAD_METHOD 16 (info) 2022-11-23T02:11:25.0338503Z 2022-11-23T02:11:25.0338803Z 1082 130 LOAD_CONST 3 ('Reducer buckets have been rebuilt in this iteration.') 2022-11-23T02:11:25.0338842Z 2022-11-23T02:11:25.0338954Z 1081 132 CALL_METHOD 1 2022-11-23T02:11:25.0339066Z 134 POP_TOP 2022-11-23T02:11:25.0339085Z 2022-11-23T02:11:25.0339219Z 1084 136 LOAD_CONST 4 (True) 2022-11-23T02:11:25.0339350Z 138 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0339515Z 140 STORE_ATTR 17 (_has_rebuilt_buckets) 2022-11-23T02:11:25.0339535Z 2022-11-23T02:11:25.0339675Z 1088 >> 142 LOAD_GLOBAL 18 (hasattr) 2022-11-23T02:11:25.0339806Z 144 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0340011Z 146 LOAD_CONST 5 ('buffer_hook') 2022-11-23T02:11:25.0340149Z 148 CALL_FUNCTION 2 2022-11-23T02:11:25.0340315Z 150 STORE_FAST 4 (buffer_hook_registered) 2022-11-23T02:11:25.0340336Z 2022-11-23T02:11:25.0340466Z 1089 152 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0340631Z 154 LOAD_METHOD 19 (_check_sync_bufs_pre_fwd) 2022-11-23T02:11:25.0340760Z 156 CALL_METHOD 0 2022-11-23T02:11:25.0340898Z 158 
POP_JUMP_IF_FALSE 84 (to 168) 2022-11-23T02:11:25.0340917Z 2022-11-23T02:11:25.0341048Z 1090 160 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0341184Z 162 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:25.0341311Z 164 CALL_METHOD 0 2022-11-23T02:11:25.0341417Z 166 POP_TOP 2022-11-23T02:11:25.0341435Z 2022-11-23T02:11:25.0341566Z 1092 >> 168 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0341714Z 170 LOAD_ATTR 21 (_join_config) 2022-11-23T02:11:25.0341849Z 172 LOAD_ATTR 22 (enable) 2022-11-23T02:11:25.0341993Z 174 POP_JUMP_IF_FALSE 94 (to 188) 2022-11-23T02:11:25.0342012Z 2022-11-23T02:11:25.0342140Z 1094 176 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0342314Z 178 LOAD_ATTR 23 (_check_global_requires_backward_grad_sync) 2022-11-23T02:11:25.0342350Z 2022-11-23T02:11:25.0342468Z 1095 180 LOAD_CONST 6 (False) 2022-11-23T02:11:25.0342487Z 2022-11-23T02:11:25.0342719Z 1094 182 LOAD_CONST 7 (('is_joined_rank',)) 2022-11-23T02:11:25.0342921Z 184 CALL_FUNCTION_KW 1 2022-11-23T02:11:25.0343031Z 186 POP_TOP 2022-11-23T02:11:25.0343051Z 2022-11-23T02:11:25.0343182Z 1098 >> 188 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0343337Z 190 LOAD_ATTR 24 (_run_ddp_forward) 2022-11-23T02:11:25.0343475Z 192 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0343605Z 194 BUILD_MAP 0 2022-11-23T02:11:25.0343723Z 196 LOAD_FAST 2 (kwargs) 2022-11-23T02:11:25.0343849Z 198 DICT_MERGE 1 2022-11-23T02:11:25.0343981Z 200 CALL_FUNCTION_EX 1 2022-11-23T02:11:25.0344116Z 202 STORE_FAST 5 (output) 2022-11-23T02:11:25.0344136Z 2022-11-23T02:11:25.0344265Z 1102 204 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0344433Z 206 LOAD_METHOD 25 (_check_sync_bufs_post_fwd) 2022-11-23T02:11:25.0344563Z 208 CALL_METHOD 0 2022-11-23T02:11:25.0344690Z 210 POP_JUMP_IF_FALSE 110 (to 220) 2022-11-23T02:11:25.0344727Z 2022-11-23T02:11:25.0344841Z 1103 212 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0344990Z 214 LOAD_METHOD 20 (_sync_buffers) 2022-11-23T02:11:25.0345233Z 216 CALL_METHOD 0 2022-11-23T02:11:25.0345350Z 218 POP_TOP 2022-11-23T02:11:25.0345369Z 
2022-11-23T02:11:25.0345507Z 1105 >> 220 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0345661Z 222 LOAD_METHOD 4 (is_grad_enabled) 2022-11-23T02:11:25.0345790Z 224 CALL_METHOD 0 2022-11-23T02:11:25.0345917Z 226 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:25.0346049Z 228 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0346219Z 230 LOAD_ATTR 5 (require_backward_grad_sync) 2022-11-23T02:11:25.0346372Z 232 POP_JUMP_IF_FALSE 144 (to 288) 2022-11-23T02:11:25.0346391Z 2022-11-23T02:11:25.0346523Z 1106 234 LOAD_CONST 4 (True) 2022-11-23T02:11:25.0346655Z 236 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0346823Z 238 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:25.0346842Z 2022-11-23T02:11:25.0346975Z 1112 240 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0347123Z 242 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:25.0347263Z 244 POP_JUMP_IF_FALSE 137 (to 274) 2022-11-23T02:11:25.0347392Z 246 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0347540Z 248 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0347684Z 250 POP_JUMP_IF_TRUE 137 (to 274) 2022-11-23T02:11:25.0347703Z 2022-11-23T02:11:25.0347834Z 1114 252 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0347975Z 254 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0348141Z 256 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:25.0348161Z 2022-11-23T02:11:25.0348276Z 1115 258 LOAD_GLOBAL 30 (list) 2022-11-23T02:11:25.0348428Z 260 LOAD_GLOBAL 31 (_find_tensors) 2022-11-23T02:11:25.0348566Z 262 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0348697Z 264 CALL_FUNCTION 1 2022-11-23T02:11:25.0348828Z 266 CALL_FUNCTION 1 2022-11-23T02:11:25.0348847Z 2022-11-23T02:11:25.0348973Z 1114 268 CALL_METHOD 1 2022-11-23T02:11:25.0349078Z 270 POP_TOP 2022-11-23T02:11:25.0349200Z 272 JUMP_FORWARD 10 (to 294) 2022-11-23T02:11:25.0349240Z 2022-11-23T02:11:25.0349354Z 1118 >> 274 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0349491Z 276 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0349724Z 278 LOAD_METHOD 29 (prepare_for_backward) 2022-11-23T02:11:25.0349853Z 280 BUILD_LIST 0 
2022-11-23T02:11:25.0349982Z 282 CALL_METHOD 1 2022-11-23T02:11:25.0350087Z 284 POP_TOP 2022-11-23T02:11:25.0350228Z 286 JUMP_FORWARD 3 (to 294) 2022-11-23T02:11:25.0350251Z 2022-11-23T02:11:25.0350370Z 1120 >> 288 LOAD_CONST 6 (False) 2022-11-23T02:11:25.0350504Z 290 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0350674Z 292 STORE_ATTR 26 (require_forward_param_sync) 2022-11-23T02:11:25.0350786Z >> 294 POP_BLOCK 2022-11-23T02:11:25.0350805Z 2022-11-23T02:11:25.0350939Z 1057 296 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0351044Z 298 DUP_TOP 2022-11-23T02:11:25.0351148Z 300 DUP_TOP 2022-11-23T02:11:25.0351279Z 302 CALL_FUNCTION 3 2022-11-23T02:11:25.0351371Z 304 POP_TOP 2022-11-23T02:11:25.0351507Z 306 JUMP_FORWARD 8 (to 324) 2022-11-23T02:11:25.0351630Z >> 308 WITH_EXCEPT_START 2022-11-23T02:11:25.0351773Z 310 POP_JUMP_IF_TRUE 157 (to 314) 2022-11-23T02:11:25.0351900Z 312 RERAISE 1 2022-11-23T02:11:25.0352066Z >> 314 POP_TOP 2022-11-23T02:11:25.0352182Z 316 POP_TOP 2022-11-23T02:11:25.0352270Z 318 POP_TOP 2022-11-23T02:11:25.0352375Z 320 POP_EXCEPT 2022-11-23T02:11:25.0352481Z 322 POP_TOP 2022-11-23T02:11:25.0352501Z 2022-11-23T02:11:25.0352635Z 1124 >> 324 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0352800Z 326 LOAD_ATTR 27 (find_unused_parameters) 2022-11-23T02:11:25.0352942Z 328 POP_JUMP_IF_FALSE 168 (to 336) 2022-11-23T02:11:25.0353073Z 330 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0353208Z 332 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0353348Z 334 POP_JUMP_IF_FALSE 178 (to 356) 2022-11-23T02:11:25.0353367Z 2022-11-23T02:11:25.0353496Z 1125 >> 336 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0353642Z 338 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0353662Z 2022-11-23T02:11:25.0353796Z 1124 340 EXTENDED_ARG 1 2022-11-23T02:11:25.0353939Z 342 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:25.0353958Z 2022-11-23T02:11:25.0354082Z 1125 344 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0354235Z 346 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0354352Z 348 LOAD_CONST 2 (1) 
2022-11-23T02:11:25.0354488Z 350 COMPARE_OP 2 (==) 2022-11-23T02:11:25.0354507Z 2022-11-23T02:11:25.0354638Z 1124 352 EXTENDED_ARG 1 2022-11-23T02:11:25.0354785Z 354 POP_JUMP_IF_FALSE 265 (to 530) 2022-11-23T02:11:25.0354804Z 2022-11-23T02:11:25.0354933Z 1128 >> 356 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0355247Z 358 LOAD_ATTR 28 (static_graph) 2022-11-23T02:11:25.0355269Z 2022-11-23T02:11:25.0355409Z 1129 360 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0355564Z 362 LOAD_ATTR 8 (num_iterations) 2022-11-23T02:11:25.0355584Z 2022-11-23T02:11:25.0355866Z 1127 364 LOAD_CONST 8 (('static_graph', 'num_iterations')) 2022-11-23T02:11:25.0355987Z 366 BUILD_CONST_KEY_MAP 2 2022-11-23T02:11:25.0356131Z 368 STORE_FAST 6 (state_dict) 2022-11-23T02:11:25.0356150Z 2022-11-23T02:11:25.0356318Z 1136 370 LOAD_GLOBAL 32 (_tree_flatten_with_rref) 2022-11-23T02:11:25.0356455Z 372 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0356587Z 374 CALL_FUNCTION 1 2022-11-23T02:11:25.0356697Z 2022-11-23T02:11:25.0356842Z 1132 376 UNPACK_SEQUENCE 3 2022-11-23T02:11:25.0356861Z 2022-11-23T02:11:25.0357017Z 1133 378 STORE_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0357036Z 2022-11-23T02:11:25.0357184Z 1134 380 STORE_FAST 8 (treespec) 2022-11-23T02:11:25.0357203Z 2022-11-23T02:11:25.0357360Z 1135 382 STORE_FAST 9 (output_is_rref) 2022-11-23T02:11:25.0357380Z 2022-11-23T02:11:25.0357856Z 1137 384 LOAD_CONST 9 ( at 0x7f6052ac4660, file "/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1137>) 2022-11-23T02:11:25.0358187Z 386 LOAD_CONST 10 ('DistributedDataParallel.forward..') 2022-11-23T02:11:25.0358323Z 388 MAKE_FUNCTION 0 2022-11-23T02:11:25.0358462Z 390 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:25.0358597Z 392 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:25.0358758Z 394 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0358888Z 396 CALL_FUNCTION 1 2022-11-23T02:11:25.0359017Z 398 CALL_FUNCTION 1 2022-11-23T02:11:25.0359107Z 400 GET_ITER 2022-11-23T02:11:25.0359312Z 402 
CALL_FUNCTION 1 2022-11-23T02:11:25.0359492Z 404 STORE_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0359511Z 2022-11-23T02:11:25.0359651Z 1140 406 LOAD_GLOBAL 35 (enumerate) 2022-11-23T02:11:25.0359809Z 408 LOAD_FAST 7 (output_tensor_list) 2022-11-23T02:11:25.0359938Z 410 CALL_FUNCTION 1 2022-11-23T02:11:25.0360042Z 412 GET_ITER 2022-11-23T02:11:25.0360163Z >> 414 FOR_ITER 18 (to 452) 2022-11-23T02:11:25.0360299Z 416 UNPACK_SEQUENCE 2 2022-11-23T02:11:25.0360438Z 418 STORE_FAST 11 (i) 2022-11-23T02:11:25.0360575Z 420 STORE_FAST 5 (output) 2022-11-23T02:11:25.0360594Z 2022-11-23T02:11:25.0360732Z 1141 422 LOAD_GLOBAL 0 (torch) 2022-11-23T02:11:25.0360876Z 424 LOAD_METHOD 36 (is_tensor) 2022-11-23T02:11:25.0361017Z 426 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0361148Z 428 CALL_METHOD 1 2022-11-23T02:11:25.0361278Z 430 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:25.0361412Z 432 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0361547Z 434 LOAD_ATTR 37 (grad_fn) 2022-11-23T02:11:25.0361681Z 436 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0361806Z 438 IS_OP 0 2022-11-23T02:11:25.0361947Z 440 POP_JUMP_IF_FALSE 225 (to 450) 2022-11-23T02:11:25.0361970Z 2022-11-23T02:11:25.0362101Z 1142 442 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0362262Z 444 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0362378Z 446 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0362492Z 448 STORE_SUBSCR 2022-11-23T02:11:25.0362631Z >> 450 JUMP_ABSOLUTE 207 (to 414) 2022-11-23T02:11:25.0362651Z 2022-11-23T02:11:25.0362798Z 1149 >> 452 LOAD_GLOBAL 38 (_DDPSink) 2022-11-23T02:11:25.0362934Z 454 LOAD_ATTR 39 (apply) 2022-11-23T02:11:25.0362954Z 2022-11-23T02:11:25.0363079Z 1150 456 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0363216Z 458 LOAD_ATTR 9 (reducer) 2022-11-23T02:11:25.0363235Z 2022-11-23T02:11:25.0363377Z 1151 460 LOAD_FAST 6 (state_dict) 2022-11-23T02:11:25.0363396Z 2022-11-23T02:11:25.0363506Z 1149 462 BUILD_LIST 2 2022-11-23T02:11:25.0363586Z 2022-11-23T02:11:25.0363750Z 1152 464 LOAD_FAST 7 
(output_tensor_list) 2022-11-23T02:11:25.0363770Z 2022-11-23T02:11:25.0363900Z 1149 466 LIST_EXTEND 1 2022-11-23T02:11:25.0364016Z 468 LIST_TO_TUPLE 2022-11-23T02:11:25.0364150Z 470 CALL_FUNCTION_EX 0 2022-11-23T02:11:25.0364327Z 472 STORE_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:25.0364347Z 2022-11-23T02:11:25.0364481Z 1154 474 LOAD_GLOBAL 33 (range) 2022-11-23T02:11:25.0364614Z 476 LOAD_GLOBAL 34 (len) 2022-11-23T02:11:25.0364761Z 478 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0364893Z 480 CALL_FUNCTION 1 2022-11-23T02:11:25.0365022Z 482 CALL_FUNCTION 1 2022-11-23T02:11:25.0365129Z 484 GET_ITER 2022-11-23T02:11:25.0365263Z >> 486 FOR_ITER 15 (to 518) 2022-11-23T02:11:25.0365400Z 488 STORE_FAST 11 (i) 2022-11-23T02:11:25.0365419Z 2022-11-23T02:11:25.0365579Z 1155 490 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0365711Z 492 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0365810Z 494 BINARY_SUBSCR 2022-11-23T02:11:25.0365943Z 496 LOAD_CONST 0 (None) 2022-11-23T02:11:25.0366121Z 498 IS_OP 0 2022-11-23T02:11:25.0366264Z 500 EXTENDED_ARG 1 2022-11-23T02:11:25.0366408Z 502 POP_JUMP_IF_FALSE 258 (to 516) 2022-11-23T02:11:25.0366428Z 2022-11-23T02:11:25.0366596Z 1156 504 LOAD_FAST 12 (passthrough_tensor_list) 2022-11-23T02:11:25.0366726Z 506 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0366825Z 508 BINARY_SUBSCR 2022-11-23T02:11:25.0366983Z 510 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0367119Z 512 LOAD_FAST 11 (i) 2022-11-23T02:11:25.0367232Z 514 STORE_SUBSCR 2022-11-23T02:11:25.0367373Z >> 516 JUMP_ABSOLUTE 243 (to 486) 2022-11-23T02:11:25.0367392Z 2022-11-23T02:11:25.0367561Z 1159 >> 518 LOAD_GLOBAL 40 (_tree_unflatten_with_rref) 2022-11-23T02:11:25.0367581Z 2022-11-23T02:11:25.0367745Z 1160 520 LOAD_FAST 10 (output_placeholders) 2022-11-23T02:11:25.0367889Z 522 LOAD_FAST 8 (treespec) 2022-11-23T02:11:25.0368024Z 524 LOAD_FAST 9 (output_is_rref) 2022-11-23T02:11:25.0368063Z 2022-11-23T02:11:25.0368177Z 1159 526 CALL_FUNCTION 3 
2022-11-23T02:11:25.0368314Z 528 STORE_FAST 5 (output) 2022-11-23T02:11:25.0368333Z 2022-11-23T02:11:25.0368468Z 1162 >> 530 LOAD_FAST 5 (output) 2022-11-23T02:11:25.0368581Z 532 RETURN_VALUE 2022-11-23T02:11:25.0368678Z 2022-11-23T02:11:25.0368700Z 2022-11-23T02:11:25.0369177Z [2022-11-23 02:11:13,795] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:25.0369484Z [2022-11-23 02:11:13,796] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [] 2022-11-23T02:11:25.0369996Z [2022-11-23 02:11:13,796] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR autograd [TorchVariable()] 2022-11-23T02:11:25.0370533Z [2022-11-23 02:11:13,796] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR profiler [TorchVariable()] 2022-11-23T02:11:25.0371094Z [2022-11-23 02:11:13,796] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR record_function [TorchVariable()] 2022-11-23T02:11:25.0371615Z [2022-11-23 02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1058 2022-11-23T02:11:25.0372137Z [2022-11-23 02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST DistributedDataParallel.forward [TorchVariable()] 2022-11-23T02:11:25.0372589Z [2022-11-23 02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1057 2022-11-23T02:11:25.0373084Z [2022-11-23 02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(), ConstantVariable(str)] 2022-11-23T02:11:25.0373392Z [2022-11-23 02:11:13,797] torch._dynamo.variables.torch: [WARNING] Profiler will be ignored 2022-11-23T02:11:25.0373737Z [2022-11-23 02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE SETUP_WITH 308 [NullContextVariable()] 2022-11-23T02:11:25.0374164Z [2022-11-23 
02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_TOP None [WithExitFunctionVariable(), ConstantVariable(NoneType)] 2022-11-23T02:11:25.0374614Z [2022-11-23 02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1060 2022-11-23T02:11:25.0375035Z [2022-11-23 02:11:13,797] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch [WithExitFunctionVariable()] 2022-11-23T02:11:25.0375624Z [2022-11-23 02:11:13,798] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_grad_enabled [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:25.0376088Z [2022-11-23 02:11:13,798] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), TorchVariable()] 2022-11-23T02:11:25.0376501Z [2022-11-23 02:11:13,798] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0376863Z [2022-11-23 02:11:13,798] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0377412Z [2022-11-23 02:11:13,798] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR require_backward_grad_sync [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0377840Z [2022-11-23 02:11:13,799] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 78 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0378287Z [2022-11-23 02:11:13,799] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1061 2022-11-23T02:11:25.0378643Z [2022-11-23 02:11:13,799] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0379149Z [2022-11-23 02:11:13,799] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger 
[WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0379588Z [2022-11-23 02:11:13,800] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:25.0380076Z [2022-11-23 02:11:13,800] torch._dynamo.symbolic_convert: [DEBUG] TRACE IS_OP 1 [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger), ConstantVariable(NoneType)] 2022-11-23T02:11:25.0380500Z [2022-11-23 02:11:13,800] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 44 [WithExitFunctionVariable(), ConstantVariable(bool)] 2022-11-23T02:11:25.0380947Z [2022-11-23 02:11:13,800] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1062 2022-11-23T02:11:25.0381291Z [2022-11-23 02:11:13,800] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [WithExitFunctionVariable()] 2022-11-23T02:11:25.0381864Z [2022-11-23 02:11:13,800] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR logger [WithExitFunctionVariable(), UnspecializedNNModuleVariable(DistributedDataParallel)] 2022-11-23T02:11:25.0382348Z [2022-11-23 02:11:13,800] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR set_runtime_stats_and_log [WithExitFunctionVariable(), UserDefinedObjectVariable(Logger)] 2022-11-23T02:11:25.0382806Z [2022-11-23 02:11:13,801] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [WithExitFunctionVariable(), UserDefinedObjectVariable(instancemethod)] 2022-11-23T02:11:25.0383263Z [2022-11-23 02:11:13,802] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 415 2022-11-23T02:11:25.0383406Z 417 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:25.0383546Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:25.0383682Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0383799Z 6 RETURN_VALUE 2022-11-23T02:11:25.0383821Z 
2022-11-23T02:11:25.0383898Z 2022-11-23T02:11:25.0384362Z [2022-11-23 02:11:13,802] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE opt_fn /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 415 2022-11-23T02:11:25.0384555Z 415 0 LOAD_DEREF 0 (ddp_m) 2022-11-23T02:11:25.0384702Z 2 LOAD_FAST 0 (inputs) 2022-11-23T02:11:25.0384723Z 2022-11-23T02:11:25.0384854Z 417 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0384968Z 6 RETURN_VALUE 2022-11-23T02:11:25.0384987Z 2022-11-23T02:11:25.0385080Z 2022-11-23T02:11:25.0385340Z [2022-11-23 02:11:13,803] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0385432Z - 2022-11-23T02:11:25.0385611Z local 'ddp_m' TYPE_MATCH 2022-11-23T02:11:25.0385709Z { 2022-11-23T02:11:25.0385912Z 'guard_types': ['TYPE_MATCH'], 2022-11-23T02:11:25.0386155Z 'code': ['___check_type_id(ddp_m, 94883284555664)'], 2022-11-23T02:11:25.0386498Z 'obj_weakref': 2022-11-23T02:11:25.0386866Z 'guarded_class': 2022-11-23T02:11:25.0386967Z } 2022-11-23T02:11:25.0387048Z 2022-11-23T02:11:25.0387158Z - 2022-11-23T02:11:25.0387340Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:25.0387439Z { 2022-11-23T02:11:25.0387640Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0387794Z 'code': None, 2022-11-23T02:11:25.0388088Z 'obj_weakref': 2022-11-23T02:11:25.0388415Z 'guarded_class': 2022-11-23T02:11:25.0388520Z } 2022-11-23T02:11:25.0388617Z 2022-11-23T02:11:25.0388956Z [2022-11-23 02:11:13,804] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward 2022-11-23T02:11:25.0389396Z [2022-11-23 02:11:13,805] torch._dynamo.symbolic_convert: [DEBUG] TRACE starts_line /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py:54 2022-11-23T02:11:25.0389691Z [2022-11-23 02:11:13,805] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [] 2022-11-23T02:11:25.0390024Z [2022-11-23 02:11:13,805] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR net [NNModuleVariable()] 
2022-11-23T02:11:25.0390414Z [2022-11-23 02:11:13,805] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs [NNModuleVariable()] 2022-11-23T02:11:25.0390772Z [2022-11-23 02:11:13,805] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()] 2022-11-23T02:11:25.0391200Z [2022-11-23 02:11:13,841] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()] 2022-11-23T02:11:25.0391530Z [2022-11-23 02:11:13,841] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing forward 2022-11-23T02:11:25.0391857Z [2022-11-23 02:11:13,842] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function compile_fn 2022-11-23T02:11:25.0392296Z [2022-11-23 02:11:13,842] torch._dynamo.optimizations.distributed: [INFO] DDPOptimizer used bucket cap 262144000 and produced the following buckets: 2022-11-23T02:11:25.0392729Z [2022-11-23 02:11:13,843] torch._dynamo.optimizations.distributed: [INFO] Please `pip install tabulate` in order to pretty-print ddp bucket sizes 2022-11-23T02:11:25.0393053Z [2022-11-23 02:11:13,843] torch._dynamo.output_graph: [INFO] Step 2: done compiler function compile_fn 2022-11-23T02:11:25.0393317Z [2022-11-23 02:11:13,843] torch._dynamo.output_graph: [CODE] TRACED GRAPH 2022-11-23T02:11:25.0393513Z __compiled_fn_5 .61 opcode, name, target, args, kwargs 2022-11-23T02:11:25.0393640Z placeholder, inputs, inputs, (), {} 2022-11-23T02:11:25.0393800Z call_module, self_net_0, self_net_0, (inputs,), {} 2022-11-23T02:11:25.0393960Z call_module, self_net_1, self_net_1, (self_net_0,), {} 2022-11-23T02:11:25.0394120Z call_module, self_net_2, self_net_2, (self_net_1,), {} 2022-11-23T02:11:25.0394341Z call_module, self_net_3, self_net_3, (self_net_2,), {} 2022-11-23T02:11:25.0394508Z call_module, self_net_4, self_net_4, (self_net_3,), {} 2022-11-23T02:11:25.0394663Z call_module, self_net_5, self_net_5, (self_net_4,), {} 2022-11-23T02:11:25.0394818Z call_module, self_net_6, 
self_net_6, (self_net_5,), {} 2022-11-23T02:11:25.0394957Z call_module, self_net_7, self_net_7, (self_net_6,), {} 2022-11-23T02:11:25.0395324Z output, output, output, ((self_net_7,),), {} 2022-11-23T02:11:25.0395346Z 2022-11-23T02:11:25.0395827Z [2022-11-23 02:11:13,843] torch._dynamo.convert_frame: [CODE] ORIGINAL BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:25.0395971Z 54 0 LOAD_FAST 0 (self) 2022-11-23T02:11:25.0396109Z 2 LOAD_METHOD 0 (net) 2022-11-23T02:11:25.0396247Z 4 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0396375Z 6 CALL_METHOD 1 2022-11-23T02:11:25.0396477Z 8 RETURN_VALUE 2022-11-23T02:11:25.0396515Z 2022-11-23T02:11:25.0396592Z 2022-11-23T02:11:25.0397062Z [2022-11-23 02:11:13,844] torch._dynamo.convert_frame: [CODE] MODIFIED BYTECODE forward /var/lib/jenkins/workspace/test/distributed/test_dynamo_distributed.py line 53 2022-11-23T02:11:25.0397221Z 53 0 LOAD_GLOBAL 1 (__compiled_fn_5) 2022-11-23T02:11:25.0397360Z 2 LOAD_FAST 1 (inputs) 2022-11-23T02:11:25.0397494Z 4 CALL_FUNCTION 1 2022-11-23T02:11:25.0397629Z 6 UNPACK_SEQUENCE 1 2022-11-23T02:11:25.0397744Z 8 RETURN_VALUE 2022-11-23T02:11:25.0397764Z 2022-11-23T02:11:25.0397842Z 2022-11-23T02:11:25.0398103Z [2022-11-23 02:11:13,844] torch._dynamo.convert_frame: [CODE] GUARDS: 2022-11-23T02:11:25.0398211Z - 2022-11-23T02:11:25.0398383Z local 'self' NN_MODULE 2022-11-23T02:11:25.0398482Z { 2022-11-23T02:11:25.0398674Z 'guard_types': ['ID_MATCH'], 2022-11-23T02:11:25.0398911Z 'code': ['___check_obj_id(self, 140050758384128)'], 2022-11-23T02:11:25.0399192Z 'obj_weakref': 2022-11-23T02:11:25.0399507Z 'guarded_class': 2022-11-23T02:11:25.0399608Z } 2022-11-23T02:11:25.0399705Z 2022-11-23T02:11:25.0399814Z - 2022-11-23T02:11:25.0399994Z local 'inputs' TENSOR_MATCH 2022-11-23T02:11:25.0400092Z { 2022-11-23T02:11:25.0400380Z 'guard_types': ['TENSOR_MATCH'], 2022-11-23T02:11:25.0400538Z 'code': None, 2022-11-23T02:11:25.0400825Z 'obj_weakref': 
2022-11-23T02:11:25.0401163Z 'guarded_class': 2022-11-23T02:11:25.0401260Z } 2022-11-23T02:11:25.0401362Z 2022-11-23T02:11:25.0401471Z - 2022-11-23T02:11:25.0401660Z local_nn_module 'self.net' NN_MODULE 2022-11-23T02:11:25.0401760Z { 2022-11-23T02:11:25.0401930Z 'guard_types': None, 2022-11-23T02:11:25.0402083Z 'code': None, 2022-11-23T02:11:25.0402252Z 'obj_weakref': None 2022-11-23T02:11:25.0402419Z 'guarded_class': None 2022-11-23T02:11:25.0402517Z } 2022-11-23T02:11:25.0402597Z 2022-11-23T02:11:25.0402703Z - 2022-11-23T02:11:25.0402915Z local_nn_module 'self.net[0]' NN_MODULE 2022-11-23T02:11:25.0403013Z { 2022-11-23T02:11:25.0403183Z 'guard_types': None, 2022-11-23T02:11:25.0403334Z 'code': None, 2022-11-23T02:11:25.0403501Z 'obj_weakref': None 2022-11-23T02:11:25.0403653Z 'guarded_class': None 2022-11-23T02:11:25.0403751Z } 2022-11-23T02:11:25.0403918Z 2022-11-23T02:11:25.0404040Z - 2022-11-23T02:11:25.0404254Z local_nn_module 'self.net[1]' NN_MODULE 2022-11-23T02:11:25.0404350Z { 2022-11-23T02:11:25.0404518Z 'guard_types': None, 2022-11-23T02:11:25.0404652Z 'code': None, 2022-11-23T02:11:25.0404820Z 'obj_weakref': None 2022-11-23T02:11:25.0404987Z 'guarded_class': None 2022-11-23T02:11:25.0405079Z } 2022-11-23T02:11:25.0405179Z 2022-11-23T02:11:25.0405286Z - 2022-11-23T02:11:25.0405474Z local_nn_module 'self.net[2]' NN_MODULE 2022-11-23T02:11:25.0405578Z { 2022-11-23T02:11:25.0405748Z 'guard_types': None, 2022-11-23T02:11:25.0405899Z 'code': None, 2022-11-23T02:11:25.0406065Z 'obj_weakref': None 2022-11-23T02:11:25.0406234Z 'guarded_class': None 2022-11-23T02:11:25.0406331Z } 2022-11-23T02:11:25.0406412Z 2022-11-23T02:11:25.0406523Z - 2022-11-23T02:11:25.0406727Z local_nn_module 'self.net[3]' NN_MODULE 2022-11-23T02:11:25.0406824Z { 2022-11-23T02:11:25.0406993Z 'guard_types': None, 2022-11-23T02:11:25.0407144Z 'code': None, 2022-11-23T02:11:25.0407309Z 'obj_weakref': None 2022-11-23T02:11:25.0407463Z 'guarded_class': None 2022-11-23T02:11:25.0407560Z } 
2022-11-23T02:11:25.0407656Z 2022-11-23T02:11:25.0407761Z - 2022-11-23T02:11:25.0407968Z local_nn_module 'self.net[4]' NN_MODULE 2022-11-23T02:11:25.0408070Z { 2022-11-23T02:11:25.0408220Z 'guard_types': None, 2022-11-23T02:11:25.0408370Z 'code': None, 2022-11-23T02:11:25.0408535Z 'obj_weakref': None 2022-11-23T02:11:25.0408706Z 'guarded_class': None 2022-11-23T02:11:25.0408803Z } 2022-11-23T02:11:25.0408898Z 2022-11-23T02:11:25.0409003Z - 2022-11-23T02:11:25.0409198Z local_nn_module 'self.net[5]' NN_MODULE 2022-11-23T02:11:25.0409298Z { 2022-11-23T02:11:25.0409470Z 'guard_types': None, 2022-11-23T02:11:25.0409620Z 'code': None, 2022-11-23T02:11:25.0409785Z 'obj_weakref': None 2022-11-23T02:11:25.0409952Z 'guarded_class': None 2022-11-23T02:11:25.0410050Z } 2022-11-23T02:11:25.0410129Z 2022-11-23T02:11:25.0410235Z - 2022-11-23T02:11:25.0410440Z local_nn_module 'self.net[6]' NN_MODULE 2022-11-23T02:11:25.0410606Z { 2022-11-23T02:11:25.0410777Z 'guard_types': None, 2022-11-23T02:11:25.0410931Z 'code': None, 2022-11-23T02:11:25.0411080Z 'obj_weakref': None 2022-11-23T02:11:25.0411249Z 'guarded_class': None 2022-11-23T02:11:25.0411346Z } 2022-11-23T02:11:25.0411445Z 2022-11-23T02:11:25.0411551Z - 2022-11-23T02:11:25.0411762Z local_nn_module 'self.net[7]' NN_MODULE 2022-11-23T02:11:25.0411859Z { 2022-11-23T02:11:25.0412012Z 'guard_types': None, 2022-11-23T02:11:25.0412162Z 'code': None, 2022-11-23T02:11:25.0412327Z 'obj_weakref': None 2022-11-23T02:11:25.0412494Z 'guarded_class': None 2022-11-23T02:11:25.0412592Z } 2022-11-23T02:11:25.0412688Z 2022-11-23T02:11:25.0412856Z frames [('total', 2), ('ok', 2)] 2022-11-23T02:11:25.0413149Z inline_call [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0413270Z unimplemented [] 2022-11-23T02:11:25.0413579Z graph_break [('call_function UserDefinedObjectVariable(instancemethod) [] {}', 1)] 2022-11-23T02:11:25.0413852Z stats [('calls_captured', 8), ('fusions_possible', 7), ('unique_graphs', 
1)] 2022-11-23T02:11:25.0413956Z ok (0.059s) 2022-11-23T02:11:25.0414357Z test_ddp_baseline_aot_eager_multiprocess (__main__.TestDistributedMultiProc) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43990 2022-11-23T02:11:25.0414590Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43991 2022-11-23T02:11:25.0414975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:25.0415138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:25.0415526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:25.0415726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:25.0416102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:25.0416276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:25.0416664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:25.0416858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:25.0417092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:25.0417308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:25.0417561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:25.0417806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:25.0418218Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:11:25.0418620Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:25.0418725Z ok (5.014s) 2022-11-23T02:11:25.0419043Z test_fsdp_aot_eager (__main__.TestDistributedMultiProc) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44078 2022-11-23T02:11:25.0419266Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44079 2022-11-23T02:11:25.0419641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:25.0419802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:25.0420186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:25.0420451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:25.0420832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:25.0421009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:25.0421388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:25.0421580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:25.0421813Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:25.0422048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:25.0422278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:25.0422522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:25.0422935Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:25.0423335Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:25.0423489Z ok (5.614s) 2022-11-23T02:11:25.0423781Z test_fsdp_inductor (__main__.TestDistributedMultiProc) ... skip: Inductor+gpu needs triton and recent GPU arch (0.001s) 2022-11-23T02:11:25.0424057Z test_hf_bert_ddp (__main__.TestDistributedMultiProc) ... skip: Inductor+gpu needs triton and recent GPU arch (0.001s) 2022-11-23T02:11:25.0424333Z test_hf_bert_fsdp (__main__.TestDistributedMultiProc) ... skip: Inductor+gpu needs triton and recent GPU arch (0.001s) 2022-11-23T02:11:25.0424354Z 2022-11-23T02:11:25.0424626Z ---------------------------------------------------------------------- 2022-11-23T02:11:25.0424732Z Ran 14 tests in 15.041s 2022-11-23T02:11:25.0424751Z 2022-11-23T02:11:25.0424861Z OK (skipped=6) 2022-11-23T02:11:25.0424880Z 2022-11-23T02:11:25.0425006Z Generating XML reports... 2022-11-23T02:11:25.0425441Z Generated XML report: test-reports/python-unittest/distributed.test_dynamo_distributed/TEST-TestDistributed-20221123021109.xml 2022-11-23T02:11:25.0425914Z Generated XML report: test-reports/python-unittest/distributed.test_dynamo_distributed/TEST-TestDistributedMultiProc-20221123021109.xml 2022-11-23T02:11:25.0425935Z 2022-11-23T02:11:25.0426360Z ##[endgroup] 2022-11-23T02:11:25.0426840Z FINISHED PRINTING LOG FILE of distributed/test_dynamo_distributed (/var/lib/jenkins/workspace/test/test-reports/distributed-test_dynamo_distributed_kagpuq9v) 2022-11-23T02:11:25.0426859Z 2022-11-23T02:11:25.0427153Z Running distributed/fsdp/test_fsdp_ignored_modules ... 
[2022-11-23 02:11:24.962712] 2022-11-23T02:11:25.0427637Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_ignored_modules.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:11:24.963034] 2022-11-23T02:11:47.5046331Z 2022-11-23T02:11:47.5046854Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_ignored_modules 2022-11-23T02:11:47.5049576Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_ignored_modules (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_ignored_modules_r_a71er2) 2022-11-23T02:11:47.5050317Z 2022-11-23T02:11:47.5050539Z Running tests... 2022-11-23T02:11:47.5051448Z ---------------------------------------------------------------------- 2022-11-23T02:11:47.5052053Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules 2022-11-23T02:11:47.5052575Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_False (__main__.TestFSDPIgnoredModules) 2022-11-23T02:11:47.5053091Z Tests ignoring different modules across ranks. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:11:47.5054179Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44201 2022-11-23T02:11:47.5055046Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44202 2022-11-23T02:11:47.5055686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5056154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5056745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5057204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5060873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5061380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5062013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5062509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5062961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:47.5063646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:47.5064342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5065046Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:11:47.5065561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:47.5066037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:47.5066390Z dist init r=0, world=2 2022-11-23T02:11:47.5066651Z dist init r=1, world=2 2022-11-23T02:11:47.5066876Z ok (5.358s) 2022-11-23T02:11:47.5067264Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_True (__main__.TestFSDPIgnoredModules) 2022-11-23T02:11:47.5067830Z Tests ignoring different modules across ranks. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44284 2022-11-23T02:11:47.5068332Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44285 2022-11-23T02:11:47.5068961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5069416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5069992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5070464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5071063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5071499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5072083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5072600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5073066Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:47.5073575Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 1 2022-11-23T02:11:47.5074226Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5074925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5075908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:47.5076395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:47.5076741Z dist init r=0, world=2 2022-11-23T02:11:47.5077002Z dist init r=1, world=2 2022-11-23T02:11:47.5077250Z ok (3.811s) 2022-11-23T02:11:47.5077563Z test_ignored_modules_invalid (__main__.TestFSDPIgnoredModules) 2022-11-23T02:11:47.5078087Z Tests that passing an FSDP module as an ignored module or the ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44367 2022-11-23T02:11:47.5078621Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44368 2022-11-23T02:11:47.5079231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5079694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5080283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5080759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5081328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5081876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5082477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5082952Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5083387Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:47.5083884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:47.5084547Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5085227Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5085759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:47.5086238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:47.5086601Z dist init r=1, world=2 2022-11-23T02:11:47.5086840Z dist init r=0, world=2 2022-11-23T02:11:47.5087080Z ok (3.309s) 2022-11-23T02:11:47.5087404Z test_ignored_modules_nested (__main__.TestFSDPIgnoredModules) 2022-11-23T02:11:47.5087913Z Tests that passing a module with nested FSDP modules does not ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44446 2022-11-23T02:11:47.5088447Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44447 2022-11-23T02:11:47.5089073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5089532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5090098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5090572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5091328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5091763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5092342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5092807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5093262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:47.5093839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:11:47.5094504Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5095209Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:11:47.5095741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:47.5096201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:47.5096553Z dist init r=1, world=2 2022-11-23T02:11:47.5096806Z dist init r=0, world=2 2022-11-23T02:11:47.5097028Z ok (3.710s) 2022-11-23T02:11:47.5097363Z test_ignored_modules_transformer (__main__.TestFSDPIgnoredModules) 2022-11-23T02:11:47.5098034Z Tests that ignored modules' parameters are not flattened for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44529 2022-11-23T02:11:47.5098554Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44530 2022-11-23T02:11:47.5099169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5099682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5100281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5100735Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5101320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:11:47.5101769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:11:47.5102353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:11:47.5102804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:11:47.5103259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:11:47.5103766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:11:47.5104414Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5105112Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:11:47.5105642Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:11:47.5106118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:11:47.5106465Z dist init r=0, world=2 2022-11-23T02:11:47.5106720Z dist init r=1, world=2 2022-11-23T02:11:47.5106961Z ok (4.012s) 2022-11-23T02:11:47.5107111Z 2022-11-23T02:11:47.5107371Z ---------------------------------------------------------------------- 2022-11-23T02:11:47.5107714Z Ran 5 tests in 20.201s 2022-11-23T02:11:47.5107880Z 2022-11-23T02:11:47.5107977Z OK 2022-11-23T02:11:47.5108115Z 2022-11-23T02:11:47.5108250Z Generating XML reports... 2022-11-23T02:11:47.5108879Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20221123021126.xml 2022-11-23T02:11:47.5109266Z 2022-11-23T02:11:47.5109726Z ##[endgroup] 2022-11-23T02:11:47.5110382Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_ignored_modules (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_ignored_modules_r_a71er2) 2022-11-23T02:11:47.5110764Z 2022-11-23T02:11:47.5111045Z Running distributed/_tensor/parallel/test_tp_style ... [2022-11-23 02:11:47.504815] 2022-11-23T02:11:47.5111858Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/parallel/test_tp_style.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:11:47.505155] 2022-11-23T02:12:11.0583452Z 2022-11-23T02:12:11.0584002Z Expand the folded group to see the log file of distributed/_tensor/parallel/test_tp_style 2022-11-23T02:12:11.0589409Z ##[group]PRINTING LOG FILE of distributed/_tensor/parallel/test_tp_style (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_tp_style_y_6953tb) 2022-11-23T02:12:11.0590187Z 2022-11-23T02:12:11.0590383Z Running tests... 2022-11-23T02:12:11.0591285Z ---------------------------------------------------------------------- 2022-11-23T02:12:11.0592207Z Test results will be stored in test-reports/python-unittest/distributed._tensor.parallel.test_tp_style 2022-11-23T02:12:11.0592773Z test_colwise_parallel_style (__main__.TensorParallelStyleTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:11.0593290Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44647 2022-11-23T02:12:11.0593744Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44648 2022-11-23T02:12:11.0594325Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 44649 2022-11-23T02:12:11.0595738Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 44650 2022-11-23T02:12:11.0596463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0596931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0597516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0597975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0598586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0599042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:12:11.0599629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0600084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0600670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0601130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0601694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0602179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0602770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0603236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0603793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0604282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0604735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0605219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0605683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0606152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0606543Z skip: Need at least 4 CUDA devices (4.034s) 2022-11-23T02:12:11.0607025Z test_make_input_replicate_1d (__main__.TensorParallelStyleTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44783 2022-11-23T02:12:11.0607700Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44784 2022-11-23T02:12:11.0608152Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 44785 2022-11-23T02:12:11.0608680Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 44786 2022-11-23T02:12:11.0609307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0609759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0610322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0610797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0611380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0611838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0612401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0612872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0613521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0613966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0614541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0615011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0615593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:12:11.0616025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0616609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0617077Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0617501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0617982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0618447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0618921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0619302Z skip: Need at least 4 CUDA devices (2.412s) 2022-11-23T02:12:11.0619792Z test_make_input_shard_1d (__main__.TensorParallelStyleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44919 2022-11-23T02:12:11.0620338Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44920 2022-11-23T02:12:11.0620790Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 44921 2022-11-23T02:12:11.0621221Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 44922 2022-11-23T02:12:11.0621841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0622296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0622858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0623329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0623916Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0624365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0625005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0625477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0626063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0626494Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0627073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0627537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0628113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0628538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0629123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0629588Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0630027Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0630546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0631023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0631489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0631867Z skip: Need at least 4 CUDA devices (2.410s) 
2022-11-23T02:12:11.0632365Z test_make_output_replicate_1d (__main__.TensorParallelStyleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45055 2022-11-23T02:12:11.0632913Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45056 2022-11-23T02:12:11.0633372Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45057 2022-11-23T02:12:11.0633803Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45058 2022-11-23T02:12:11.0634415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0634870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0635916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0636393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0636977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0637422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0637990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0638457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0639038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0639490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0640053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0640517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0641094Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0641521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0642097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0642681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0643115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0643576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0644047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0644517Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0644893Z skip: Need at least 4 CUDA devices (2.411s) 2022-11-23T02:12:11.0645384Z test_make_output_shard_1d (__main__.TensorParallelStyleTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45191 2022-11-23T02:12:11.0645926Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45192 2022-11-23T02:12:11.0646378Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45193 2022-11-23T02:12:11.0646807Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45194 2022-11-23T02:12:11.0647425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0647952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0648533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0649005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0649585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0650034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0650593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0651065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0651646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0652090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0652654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0653120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0653698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:12:11.0654123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0654705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0655175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0655610Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0656068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0656536Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0657007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0657388Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:12:11.0657880Z test_make_output_tensor (__main__.TensorParallelStyleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45327 2022-11-23T02:12:11.0658415Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45328 2022-11-23T02:12:11.0658862Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45329 2022-11-23T02:12:11.0659371Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45330 2022-11-23T02:12:11.0659982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0660434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0661020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0661473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0662057Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0662504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0663060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0663537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0664119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0664567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0665184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0665663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0666242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0666671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0667242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0667713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0668150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0668611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0669074Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0669551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0669931Z skip: Need at least 4 CUDA devices (2.510s) 
2022-11-23T02:12:11.0670429Z test_prepare_output_error (__main__.TensorParallelStyleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45463 2022-11-23T02:12:11.0670975Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45464 2022-11-23T02:12:11.0671424Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45465 2022-11-23T02:12:11.0671864Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45466 2022-11-23T02:12:11.0672479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0672934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0673521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0673981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0674567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0675359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0676075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0676554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0677252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0677698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0678260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0678735Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0679313Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0679761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0680315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0680778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0681224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0681685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0682150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0682700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0683112Z skip: Need at least 4 CUDA devices (2.611s) 2022-11-23T02:12:11.0683592Z test_rowwise_parallel_style (__main__.TensorParallelStyleTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45599 2022-11-23T02:12:11.0684139Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45600 2022-11-23T02:12:11.0684587Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45601 2022-11-23T02:12:11.0685018Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45602 2022-11-23T02:12:11.0685640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0686095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0686670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0687127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0687708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0688154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0688716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0689180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0689775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:11.0690220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0690781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0691250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0691830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:12:11.0692320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:11.0692879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:11.0693343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:11.0693779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:12:11.0694318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:12:11.0694779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:11.0695253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:11.0695652Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:12:11.0695833Z 2022-11-23T02:12:11.0696116Z ---------------------------------------------------------------------- 2022-11-23T02:12:11.0696454Z Ran 8 tests in 21.208s 2022-11-23T02:12:11.0696619Z 2022-11-23T02:12:11.0696729Z OK (skipped=8) 2022-11-23T02:12:11.0696885Z 2022-11-23T02:12:11.0696993Z Generating XML reports... 2022-11-23T02:12:11.0697637Z Generated XML report: test-reports/python-unittest/distributed._tensor.parallel.test_tp_style/TEST-TensorParallelStyleTest-20221123021149.xml 2022-11-23T02:12:11.0698033Z 2022-11-23T02:12:11.0698380Z ##[endgroup] 2022-11-23T02:12:11.0699003Z FINISHED PRINTING LOG FILE of distributed/_tensor/parallel/test_tp_style (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_tp_style_y_6953tb) 2022-11-23T02:12:11.0699384Z 2022-11-23T02:12:11.0699703Z Running distributed/algorithms/ddp_comm_hooks/test_ddp_hooks ... 
[2022-11-23 02:12:11.058525] 2022-11-23T02:12:11.0700532Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:12:11.058900] 2022-11-23T02:12:39.8579189Z 2022-11-23T02:12:39.8579746Z Expand the folded group to see the log file of distributed/algorithms/ddp_comm_hooks/test_ddp_hooks 2022-11-23T02:12:39.8583827Z ##[group]PRINTING LOG FILE of distributed/algorithms/ddp_comm_hooks/test_ddp_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-algorithms-ddp_comm_hooks-test_ddp_hooks_qu0_b3y_) 2022-11-23T02:12:39.8584310Z 2022-11-23T02:12:39.8584444Z Running tests... 2022-11-23T02:12:39.8585166Z ---------------------------------------------------------------------- 2022-11-23T02:12:39.8586267Z Test results will be stored in test-reports/python-unittest/distributed.algorithms.ddp_comm_hooks.test_ddp_hooks 2022-11-23T02:12:39.8586839Z test_ddp_comm_hook_allreduce_hook (__main__.DistributedDataParallelCommHookTest) 2022-11-23T02:12:39.8587602Z This unit test verifies the ``allreduce`` hook registered case gives same result ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:39.8589856Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45770 2022-11-23T02:12:39.8590341Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45771 2022-11-23T02:12:39.8591040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8591505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8592089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8592575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8593217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8593685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8594256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8594739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8595497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:39.8595985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:39.8596486Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmzw1jfos 2022-11-23T02:12:39.8597269Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmzw1jfos/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8597841Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptct5m0l2 2022-11-23T02:12:39.8598387Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptct5m0l2/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8598758Z ok 
(5.368s) 2022-11-23T02:12:39.8599137Z test_ddp_comm_hook_fp16compress_hook (__main__.DistributedDataParallelCommHookTest) 2022-11-23T02:12:39.8599718Z This unit test verifies the ``fp16 compress`` hook registered case ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45854 2022-11-23T02:12:39.8600301Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45855 2022-11-23T02:12:39.8600921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8601384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8601970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8602445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8603125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8603596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8604181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8604637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8605081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:39.8605562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:39.8606075Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuhp0tgcx 2022-11-23T02:12:39.8606605Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuhp0tgcx/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8607146Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgvrkx0ce 
2022-11-23T02:12:39.8607686Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgvrkx0ce/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8608071Z ok (3.810s) 2022-11-23T02:12:39.8608418Z test_ddp_comm_hook_noop_hook (__main__.DistributedDataParallelCommHookTest) 2022-11-23T02:12:39.8609005Z This unit test verifies the ``noop`` hook registered case and a subsequent allreduce ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45938 2022-11-23T02:12:39.8609561Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45939 2022-11-23T02:12:39.8610169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8610627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8611210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8611690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8612260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8612712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8613290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8613742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8614184Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:39.8614735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:39.8615239Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfll_k1x1 2022-11-23T02:12:39.8615768Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpfll_k1x1/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8616303Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnhrfin49 2022-11-23T02:12:39.8616846Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnhrfin49/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8617216Z ok (3.710s) 2022-11-23T02:12:39.8617607Z test_ddp_comm_hook_quantize_per_channel_hook (__main__.DistributedDataParallelCommHookTest) 2022-11-23T02:12:39.8618204Z This unit test verifies the ``quantize per channel`` hook registered case ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46022 2022-11-23T02:12:39.8618757Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46023 2022-11-23T02:12:39.8619364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8619821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8620458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8620945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8621517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8621963Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8622547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8623022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8623466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:39.8623928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:39.8624435Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy2w1bw8b 2022-11-23T02:12:39.8624981Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy2w1bw8b/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8625501Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp52ha5pd 2022-11-23T02:12:39.8626044Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp52ha5pd/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8626435Z ok (3.710s) 2022-11-23T02:12:39.8626823Z test_ddp_comm_hook_quantize_per_tensor_hook (__main__.DistributedDataParallelCommHookTest) 2022-11-23T02:12:39.8627399Z This unit test verifies the ``quantize per tensor`` hook registered case ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46106 2022-11-23T02:12:39.8627939Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46107 2022-11-23T02:12:39.8628562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8629028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8629596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8630068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8630655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8631084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8631745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8632216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8632659Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 0 2022-11-23T02:12:39.8633123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:39.8633624Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvyhzb4rr 2022-11-23T02:12:39.8634173Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvyhzb4rr/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8634708Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9l_tp4i4 2022-11-23T02:12:39.8635529Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9l_tp4i4/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8635921Z ok (3.811s) 2022-11-23T02:12:39.8636403Z test_is_last_hook (__main__.DistributedDataParallelCommHookTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46190 2022-11-23T02:12:39.8636970Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46191 2022-11-23T02:12:39.8637578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8638122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8638728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8639186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8639772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:39.8640220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:39.8640812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:39.8641267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:39.8641711Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:39.8642194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:39.8642680Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp38tosym5 2022-11-23T02:12:39.8643227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp38tosym5/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8643765Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxcd129yx 2022-11-23T02:12:39.8644301Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxcd129yx/_remote_module_non_scriptable.py 2022-11-23T02:12:39.8644669Z ok (6.116s) 2022-11-23T02:12:39.8644826Z 2022-11-23T02:12:39.8645112Z ---------------------------------------------------------------------- 2022-11-23T02:12:39.8645453Z Ran 6 tests in 26.525s 2022-11-23T02:12:39.8645619Z 2022-11-23T02:12:39.8645697Z OK 2022-11-23T02:12:39.8645834Z 2022-11-23T02:12:39.8645961Z Generating XML reports... 2022-11-23T02:12:39.8646689Z Generated XML report: test-reports/python-unittest/distributed.algorithms.ddp_comm_hooks.test_ddp_hooks/TEST-DistributedDataParallelCommHookTest-20221123021212.xml 2022-11-23T02:12:39.8647140Z 2022-11-23T02:12:39.8647598Z ##[endgroup] 2022-11-23T02:12:39.8648274Z FINISHED PRINTING LOG FILE of distributed/algorithms/ddp_comm_hooks/test_ddp_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-algorithms-ddp_comm_hooks-test_ddp_hooks_qu0_b3y_) 2022-11-23T02:12:39.8648689Z 2022-11-23T02:12:39.8649027Z Running distributed/_shard/sharded_tensor/ops/test_matrix_ops ... [2022-11-23 02:12:39.857992] 2022-11-23T02:12:39.8649769Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_matrix_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:12:39.858268] 2022-11-23T02:13:10.4127209Z 2022-11-23T02:13:10.4129862Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T02:13:10.4131252Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_matrix_ops_gqddspz4) 2022-11-23T02:13:10.4131684Z 2022-11-23T02:13:10.4131808Z Running tests... 2022-11-23T02:13:10.4132325Z ---------------------------------------------------------------------- 2022-11-23T02:13:10.4133769Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops 2022-11-23T02:13:10.4134366Z test_sharded_tensor_contiguous (__main__.TestShardedTensorMatrixOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:13:10.4134900Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46309 2022-11-23T02:13:10.4135360Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46310 2022-11-23T02:13:10.4135797Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46311 2022-11-23T02:13:10.4137790Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46312 2022-11-23T02:13:10.4138749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4139260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4139868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4140330Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4140938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4146307Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4147551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4148551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4149625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4150578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4151669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4152713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4153355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4153826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4154397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4154876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4155748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4156214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4156692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4157160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4157564Z skip: Need at least 4 CUDA devices (4.021s) 2022-11-23T02:13:10.4158057Z test_sharded_tensor_layer_norm (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46445 2022-11-23T02:13:10.4158801Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46446 2022-11-23T02:13:10.4159255Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46447 2022-11-23T02:13:10.4159707Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46448 2022-11-23T02:13:10.4160329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4160790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4161379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4161845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4162433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4162893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4163476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4163931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4164603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4165068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4165632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4166104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4166691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:10.4167141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4167708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4168182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4168624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4169111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4169567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4170029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4170430Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:10.4170929Z test_sharded_tensor_layer_norm_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46581 2022-11-23T02:13:10.4171502Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46582 2022-11-23T02:13:10.4171959Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46583 2022-11-23T02:13:10.4172412Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46584 2022-11-23T02:13:10.4173024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4173482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4174065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4174524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4175111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4175564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4176222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4176673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4177261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4177717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4178302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4178753Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4179339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:10.4179789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4180357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4180827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4181268Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4181806Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4182273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4182740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4183141Z skip: Need at least 4 CUDA devices (2.411s) 2022-11-23T02:13:10.4183639Z test_sharded_tensor_masked_fill (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46717 2022-11-23T02:13:10.4184200Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46718 2022-11-23T02:13:10.4184657Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46719 2022-11-23T02:13:10.4185107Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46720 2022-11-23T02:13:10.4185710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4186171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4186778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4187254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4187829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4188280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4188869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4189322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4189904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4190354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4190934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4191382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4191966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:10.4192413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4192973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4193512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4193950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4194523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4194984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4195793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4196168Z skip: Need at least 4 CUDA devices (2.409s) 2022-11-23T02:13:10.4196689Z test_sharded_tensor_masked_fill_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46853 2022-11-23T02:13:10.4197255Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46854 2022-11-23T02:13:10.4197718Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46855 2022-11-23T02:13:10.4198151Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46856 2022-11-23T02:13:10.4198774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4199322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4199891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4200344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4200925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4201400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4201974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4202451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4203036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4203467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4204053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4204526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4205113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:10.4205544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4206127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4206600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4207025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4207505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4207978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4208453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4208834Z skip: Need at least 4 CUDA devices (2.411s) 2022-11-23T02:13:10.4209344Z test_sharded_tensor_softmax (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46989 2022-11-23T02:13:10.4209894Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46990 2022-11-23T02:13:10.4210351Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46991 2022-11-23T02:13:10.4210869Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46992 2022-11-23T02:13:10.4211484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4211940Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4212514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4212992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4213578Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4214029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4214589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4215065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4215647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4216074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4216709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4217189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4217776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4218201Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4218779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4219256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4219701Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4220159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4220627Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4221102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4221480Z skip: Need at least 4 CUDA devices (2.411s) 
2022-11-23T02:13:10.4221994Z test_sharded_tensor_transpose (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47125 2022-11-23T02:13:10.4222556Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47126 2022-11-23T02:13:10.4223008Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47127 2022-11-23T02:13:10.4223444Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47128 2022-11-23T02:13:10.4224058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4224511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4225082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4225560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4226148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4226598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4227159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4227694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4228280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4228726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4229291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4229756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4230336Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4230764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4231342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4231807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4232249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4232708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4233176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4233699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4234089Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:10.4234609Z test_sharded_tensor_transpose_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47261 2022-11-23T02:13:10.4235551Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47262 2022-11-23T02:13:10.4236014Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47263 2022-11-23T02:13:10.4236445Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47264 2022-11-23T02:13:10.4237070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4237530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4238098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4238573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4239163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4239616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4240177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4240649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4241242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4241689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4242250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4242719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4243303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:10.4243731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4244308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4244777Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4245323Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4245784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4246248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4246722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4247106Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:10.4247612Z test_sharded_tensor_type_as (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47397 2022-11-23T02:13:10.4248166Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47398 2022-11-23T02:13:10.4248620Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47399 2022-11-23T02:13:10.4249047Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47400 2022-11-23T02:13:10.4249674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4250133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4250718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4251248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4251849Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4252299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4252857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4253330Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4253920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4254371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4254933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4255410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4255993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4256421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4257000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4257467Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4257907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4258372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4258838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4259310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4259711Z skip: Need at least 4 CUDA devices (2.410s) 
2022-11-23T02:13:10.4260196Z test_sharded_tensor_view (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47533 2022-11-23T02:13:10.4260747Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47534 2022-11-23T02:13:10.4261200Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47535 2022-11-23T02:13:10.4261629Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47536 2022-11-23T02:13:10.4262242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4262801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4263391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4263852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4264441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4264891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4265452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4265924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4266506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4266959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4267519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4267988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4268621Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4269081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4269643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4270113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4270552Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4271014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4271478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4271951Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4272346Z skip: Need at least 4 CUDA devices (2.510s) 2022-11-23T02:13:10.4272840Z test_sharded_tensor_view_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47669 2022-11-23T02:13:10.4273397Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47670 2022-11-23T02:13:10.4273849Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47671 2022-11-23T02:13:10.4274279Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47672 2022-11-23T02:13:10.4274898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4275671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4276259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4276714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4277300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4277749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4278309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4278778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4279361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:10.4279902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4280468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4280938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4281518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:10.4281964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:10.4282520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:10.4282985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:10.4283429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:10.4283889Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:10.4284373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:10.4284835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:10.4285231Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:10.4285411Z 2022-11-23T02:13:10.4285758Z ---------------------------------------------------------------------- 2022-11-23T02:13:10.4286108Z Ran 11 tests in 28.225s 2022-11-23T02:13:10.4286276Z 2022-11-23T02:13:10.4286391Z OK (skipped=11) 2022-11-23T02:13:10.4286550Z 2022-11-23T02:13:10.4286657Z Generating XML reports... 2022-11-23T02:13:10.4287340Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20221123021241.xml 2022-11-23T02:13:10.4287749Z 2022-11-23T02:13:10.4288188Z ##[endgroup] 2022-11-23T02:13:10.4288878Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_matrix_ops_gqddspz4) 2022-11-23T02:13:10.4289291Z 2022-11-23T02:13:10.4289557Z Running distributed/_tensor/test_common_rules ... 
[2022-11-23 02:13:10.412973] 2022-11-23T02:13:10.4290254Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_common_rules.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:10.413254] 2022-11-23T02:13:41.0624641Z 2022-11-23T02:13:41.0628817Z Expand the folded group to see the log file of distributed/_tensor/test_common_rules 2022-11-23T02:13:41.0630456Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_common_rules (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_common_rules_x7d5o8h4) 2022-11-23T02:13:41.0631133Z 2022-11-23T02:13:41.0631320Z Running tests... 2022-11-23T02:13:41.0634822Z ---------------------------------------------------------------------- 2022-11-23T02:13:41.0636161Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_common_rules 2022-11-23T02:13:41.0637023Z test_einop_basic_propagation (__main__.CommonRulesTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:13:41.0637514Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47840 2022-11-23T02:13:41.0637966Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47841 2022-11-23T02:13:41.0638426Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47842 2022-11-23T02:13:41.0638870Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47843 2022-11-23T02:13:41.0639525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0639971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0640567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0641318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0641923Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0642364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0642956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0643446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0644014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0644478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0645064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0645551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0646123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0646588Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0647291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0647777Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0648208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0648686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0649161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0649768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0650156Z skip: Need at least 4 CUDA devices (4.005s) 
2022-11-23T02:13:41.0650618Z test_einop_errors (__main__.CommonRulesTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47976 2022-11-23T02:13:41.0651128Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47977 2022-11-23T02:13:41.0651566Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47978 2022-11-23T02:13:41.0652007Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47979 2022-11-23T02:13:41.0652629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0653087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0653653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0654133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0654715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0655146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0655727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0656192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0656775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0657204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0657779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0658239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0658886Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0659330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0659903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0660371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0660792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0661267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0661740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0662212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0662595Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:41.0663064Z test_einop_linearity (__main__.CommonRulesTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48112 2022-11-23T02:13:41.0663581Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48113 2022-11-23T02:13:41.0664016Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48114 2022-11-23T02:13:41.0664513Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48115 2022-11-23T02:13:41.0665142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0665600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0666162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0666633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0667223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0667653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0668228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0668697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0669279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0669706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0670285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0670748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0671327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:41.0671760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0672332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0672800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0673222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0673696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0674163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0674630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0675009Z skip: Need at least 4 CUDA devices (2.411s) 2022-11-23T02:13:41.0675914Z test_einop_merge_sharding (__main__.CommonRulesTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48248 2022-11-23T02:13:41.0676441Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48249 2022-11-23T02:13:41.0676877Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48250 2022-11-23T02:13:41.0677331Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48251 2022-11-23T02:13:41.0677952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0678404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0678964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0679436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0680022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0680458Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0681034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0681498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0682156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0682599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0683180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0683647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0684229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0684664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0685241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0685710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0686137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0686613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0687077Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0687546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0687924Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:41.0688407Z test_einop_multi_sharding_on_mesh_dim (__main__.CommonRulesTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48384 2022-11-23T02:13:41.0688940Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48385 2022-11-23T02:13:41.0689377Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48386 2022-11-23T02:13:41.0689822Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48387 2022-11-23T02:13:41.0690440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0690899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0691462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0691930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0692510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0693027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0693593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0694063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0694651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0695080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0695697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0696171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0696751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:41.0697185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0697760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0698226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0698704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0699195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0699655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0700124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0700503Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:41.0700985Z test_einop_pointwise_propagation (__main__.CommonRulesTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48520 2022-11-23T02:13:41.0701520Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48521 2022-11-23T02:13:41.0701969Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48522 2022-11-23T02:13:41.0702397Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48523 2022-11-23T02:13:41.0703010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0703466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0704026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0704495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0705076Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0705525Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0706093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0706561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0707146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0707576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0708152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0708619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0709198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0709624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0710273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0710740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0711176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0711638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0712099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0712564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0712941Z skip: Need at least 4 CUDA devices (2.411s) 
2022-11-23T02:13:41.0713452Z test_pointwise_enforce_sharding_multi_sharding_on_mesh_dim (__main__.CommonRulesTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48656 2022-11-23T02:13:41.0714016Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48657 2022-11-23T02:13:41.0714465Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48658 2022-11-23T02:13:41.0714891Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48659 2022-11-23T02:13:41.0716456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0716955Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0717521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0717994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0718573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0719016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0719586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0720055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0720634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0721084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0721647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0722113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:13:41.0722689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0723116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0723694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0724158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0724596Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0725058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0725521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0725989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0726368Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:41.0726857Z test_pointwise_multi_sharding_on_mesh_dim (__main__.CommonRulesTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48792 2022-11-23T02:13:41.0727393Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48793 2022-11-23T02:13:41.0727944Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48794 2022-11-23T02:13:41.0728373Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48795 2022-11-23T02:13:41.0728990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0729445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0730009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0730485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0731070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0731514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0732074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0732548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0733129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0733624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0734197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0734667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0735245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:41.0735673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0736244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0736720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0737160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0737615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0738079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0738550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0738926Z skip: Need at least 4 CUDA devices (2.511s) 2022-11-23T02:13:41.0739413Z test_pointwise_rules_broadcasting (__main__.CommonRulesTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48928 2022-11-23T02:13:41.0739943Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48929 2022-11-23T02:13:41.0740394Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48930 2022-11-23T02:13:41.0740826Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48931 2022-11-23T02:13:41.0741440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0741900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0742485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0742940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0743522Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0743968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0744527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0745079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0745667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0746112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0746674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0747135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0747712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0748141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0748714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0749184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0749619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0750073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0750587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0751070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0751448Z skip: Need at least 4 CUDA devices (2.410s) 
2022-11-23T02:13:41.0751930Z test_pointwise_rules_suggestion (__main__.CommonRulesTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49064 2022-11-23T02:13:41.0752463Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49065 2022-11-23T02:13:41.0752917Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49066 2022-11-23T02:13:41.0753353Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49067 2022-11-23T02:13:41.0753966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0754420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0755000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0755759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0756346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0756797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0757351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0757827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0758406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0758854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0759414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0759882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0760463Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0760905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0761460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0761925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0762459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0762919Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0763381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0763851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0764245Z skip: Need at least 4 CUDA devices (2.510s) 2022-11-23T02:13:41.0764691Z test_reduction_rule (__main__.CommonRulesTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49200 2022-11-23T02:13:41.0765202Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49201 2022-11-23T02:13:41.0765650Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49202 2022-11-23T02:13:41.0766075Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49203 2022-11-23T02:13:41.0766690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0767147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0767798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0768270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0768856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0769304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0769864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0770330Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0770916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:41.0771360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0771921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0772394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0772974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:13:41.0773421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:41.0773978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:41.0774449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:41.0774892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:41.0775351Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:41.0775815Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:41.0776292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:41.0776692Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:13:41.0776872Z 2022-11-23T02:13:41.0777148Z ---------------------------------------------------------------------- 2022-11-23T02:13:41.0777490Z Ran 11 tests in 28.310s 2022-11-23T02:13:41.0777658Z 2022-11-23T02:13:41.0777772Z OK (skipped=11) 2022-11-23T02:13:41.0777929Z 2022-11-23T02:13:41.0778038Z Generating XML reports... 2022-11-23T02:13:41.0778632Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_common_rules/TEST-CommonRulesTest-20221123021312.xml 2022-11-23T02:13:41.0779049Z 2022-11-23T02:13:41.0779508Z ##[endgroup] 2022-11-23T02:13:41.0780135Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_common_rules (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_common_rules_x7d5o8h4) 2022-11-23T02:13:41.0780476Z 2022-11-23T02:13:41.0780751Z Running distributed/fsdp/test_fsdp_comm ... [2022-11-23 02:13:41.062802] 2022-11-23T02:13:41.0781438Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:13:41.063191] 2022-11-23T02:14:16.4598625Z 2022-11-23T02:14:16.4599145Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_comm 2022-11-23T02:14:16.4600162Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hfh5alhg) 2022-11-23T02:14:16.4603848Z 2022-11-23T02:14:16.4604245Z Running tests... 2022-11-23T02:14:16.4604800Z ---------------------------------------------------------------------- 2022-11-23T02:14:16.4605407Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm 2022-11-23T02:14:16.4605935Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:14:16.4606787Z Tests FSDP's communication cost in terms of calls to collective ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:14:16.4607314Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49371 2022-11-23T02:14:16.4607897Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49372 2022-11-23T02:14:16.4609083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4610144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4611323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4612389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4613083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4613964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4615060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T02:14:16.4616164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4617004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4617500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4618180Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4618888Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4619419Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4619881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4620238Z dist init r=1, world=2 2022-11-23T02:14:16.4620500Z dist init r=0, world=2 2022-11-23T02:14:16.4620724Z ok (5.652s) 2022-11-23T02:14:16.4621161Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:14:16.4621914Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49454 2022-11-23T02:14:16.4622457Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49455 2022-11-23T02:14:16.4623228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4623684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4624269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4624751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4625325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4625778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4626357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4626811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4627268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4627777Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4628444Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4629190Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:14:16.4629729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4630205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4630571Z dist init r=1, world=2 2022-11-23T02:14:16.4630809Z dist init r=0, world=2 2022-11-23T02:14:16.4631054Z ok (4.013s) 2022-11-23T02:14:16.4631441Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:14:16.4632148Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49537 2022-11-23T02:14:16.4632689Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49538 2022-11-23T02:14:16.4633304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4633751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4634337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4634813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4635960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4636396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4636987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4637462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4637921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4638412Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4639085Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4639781Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4640289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4640766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4641252Z dist init r=0, world=2 2022-11-23T02:14:16.4641511Z dist init r=1, world=2 2022-11-23T02:14:16.4641736Z ok (4.012s) 2022-11-23T02:14:16.4642163Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:14:16.4642916Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49620 2022-11-23T02:14:16.4643440Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49621 2022-11-23T02:14:16.4644053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4644506Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4645091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4645555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4646143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4646593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4647233Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4647718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4648173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4648673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4649324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4650018Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4650547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4651025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4651368Z dist init r=1, world=2 2022-11-23T02:14:16.4651642Z dist init r=0, world=2 2022-11-23T02:14:16.4651889Z ok (4.112s) 2022-11-23T02:14:16.4652259Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:14:16.4652967Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49703 2022-11-23T02:14:16.4653506Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49704 2022-11-23T02:14:16.4654121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4654565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4655150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4655627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4656197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4656648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4657232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4657700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4658139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4658712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4659379Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4660072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:14:16.4660584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4661063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4662336Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4663137Z warnings.warn( 2022-11-23T02:14:16.4664355Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4665125Z warnings.warn( 2022-11-23T02:14:16.4665377Z dist init r=0, world=2 2022-11-23T02:14:16.4665632Z dist init r=1, world=2 2022-11-23T02:14:16.4665855Z ok (3.812s) 2022-11-23T02:14:16.4666287Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:14:16.4667043Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49786 2022-11-23T02:14:16.4667585Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49787 2022-11-23T02:14:16.4668181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4668638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4669223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4669699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4670271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4670721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4671306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4671756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4672211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4672716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4673377Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4674055Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:14:16.4674582Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4675416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4676921Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4677705Z warnings.warn( 2022-11-23T02:14:16.4678846Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4679622Z warnings.warn( 2022-11-23T02:14:16.4679876Z dist init r=1, world=2 2022-11-23T02:14:16.4680132Z dist init r=0, world=2 2022-11-23T02:14:16.4680355Z ok (3.812s) 2022-11-23T02:14:16.4680739Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication) 2022-11-23T02:14:16.4681533Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49869 2022-11-23T02:14:16.4682067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49870 2022-11-23T02:14:16.4682687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4683147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4683731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4684196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4684785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4685235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4685798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4686272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4686732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4687234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4687885Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4688584Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:14:16.4689113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4689589Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4690843Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4691609Z warnings.warn( 2022-11-23T02:14:16.4692770Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4693607Z warnings.warn( 2022-11-23T02:14:16.4693864Z dist init r=0, world=2 2022-11-23T02:14:16.4694100Z dist init r=1, world=2 2022-11-23T02:14:16.4694343Z ok (3.811s) 2022-11-23T02:14:16.4694766Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-11-23T02:14:16.4695494Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49952 2022-11-23T02:14:16.4696034Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49953 2022-11-23T02:14:16.4696653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4697147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4697714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4698246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4698845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:16.4699276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:16.4699854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:16.4700325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:16.4700781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:16.4701270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:16.4701934Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:16.4702634Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:14:16.4703161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:16.4703617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:16.4704880Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4705669Z warnings.warn( 2022-11-23T02:14:16.4706834Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:16.4707602Z warnings.warn( 2022-11-23T02:14:16.4707838Z dist init r=0, world=2 2022-11-23T02:14:16.4708093Z dist init r=1, world=2 2022-11-23T02:14:16.4708333Z ok (3.812s) 2022-11-23T02:14:16.4708482Z 2022-11-23T02:14:16.4708741Z ---------------------------------------------------------------------- 2022-11-23T02:14:16.4709166Z Ran 8 tests in 33.037s 2022-11-23T02:14:16.4709331Z 2022-11-23T02:14:16.4709428Z OK 2022-11-23T02:14:16.4709564Z 2022-11-23T02:14:16.4709671Z Generating XML reports... 
2022-11-23T02:14:16.4710273Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20221123021343.xml 2022-11-23T02:14:16.4710624Z 2022-11-23T02:14:16.4710977Z ##[endgroup] 2022-11-23T02:14:16.4711552Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hfh5alhg) 2022-11-23T02:14:16.4711902Z 2022-11-23T02:14:16.4712168Z Running distributed/test_c10d_common ... [2022-11-23 02:14:16.460109] 2022-11-23T02:14:16.4712863Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_common.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:14:16.460401] 2022-11-23T02:15:21.8926522Z 2022-11-23T02:15:21.8927052Z Expand the folded group to see the log file of distributed/test_c10d_common 2022-11-23T02:15:21.8927983Z ##[group]PRINTING LOG FILE of distributed/test_c10d_common (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_common_l2ws5rx4) 2022-11-23T02:15:21.8930617Z ]> 2022-11-23T02:15:21.8931946Z test_debug_level (__main__.CommTest) 2022-11-23T02:15:21.8932771Z , <__main__.ComputeBucketAssignmentTest testMethod=test_multi_limit_single_dtype>, <__main__.ComputeBucketAssignmentTest testMethod=test_single_limit_multi_dtype>, <__main__.ComputeBucketAssignmentTest testMethod=test_single_limit_single_dtype>]> 2022-11-23T02:15:21.8933679Z test_multi_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T02:15:21.8934136Z test_multi_limit_single_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T02:15:21.8934582Z test_single_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T02:15:21.8934999Z test_single_limit_single_dtype (__main__.ComputeBucketAssignmentTest) 2022-11-23T02:15:21.8935870Z , <__main__.PythonProcessGroupExtensionTest testMethod=test_collectives>, <__main__.PythonProcessGroupExtensionTest testMethod=test_get_backend_name>, 
<__main__.PythonProcessGroupExtensionTest testMethod=test_send_recv>]> 2022-11-23T02:15:21.8936711Z test_backend_class_attr (__main__.PythonProcessGroupExtensionTest) 2022-11-23T02:15:21.8937530Z test_collectives (__main__.PythonProcessGroupExtensionTest) 2022-11-23T02:15:21.8938248Z test_get_backend_name (__main__.PythonProcessGroupExtensionTest) 2022-11-23T02:15:21.8939237Z test_send_recv (__main__.PythonProcessGroupExtensionTest) 2022-11-23T02:15:21.8940325Z , <__main__.ReduceOpTest testMethod=test_reduceop_copyable>, <__main__.ReduceOpTest testMethod=test_reduceop_pickle>]> 2022-11-23T02:15:21.8940933Z test_op_isinstance_of_reduceop (__main__.ReduceOpTest) 2022-11-23T02:15:21.8941278Z test_reduceop_copyable (__main__.ReduceOpTest) 2022-11-23T02:15:21.8941622Z test_reduceop_pickle (__main__.ReduceOpTest) 2022-11-23T02:15:21.8942557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8943205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8944419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8945406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8945753Z 2022-11-23T02:15:21.8945971Z Running tests... 2022-11-23T02:15:21.8946809Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8948135Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.8949003Z test_debug_level (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.8949727Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50103 2022-11-23T02:15:21.8950531Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50104 2022-11-23T02:15:21.8951658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8952545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8953519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8954159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8954750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8955849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8956428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8956901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8957460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:21.8957937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:21.8958289Z ok (3.960s) 2022-11-23T02:15:21.8958443Z 2022-11-23T02:15:21.8958725Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8959056Z Ran 1 test in 3.960s 2022-11-23T02:15:21.8959202Z 2022-11-23T02:15:21.8959298Z OK 2022-11-23T02:15:21.8959434Z 2022-11-23T02:15:21.8959561Z Generating XML reports... 
2022-11-23T02:15:21.8960123Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-CommTest-20221123021420.xml 2022-11-23T02:15:21.8960783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8961240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8961823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8962294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8962507Z 2022-11-23T02:15:21.8962621Z Running tests... 2022-11-23T02:15:21.8963038Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8963575Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.8964085Z test_multi_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.8964474Z ok (1.637s) 2022-11-23T02:15:21.8964623Z 2022-11-23T02:15:21.8964891Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8965216Z Ran 1 test in 1.637s 2022-11-23T02:15:21.8965360Z 2022-11-23T02:15:21.8965454Z OK 2022-11-23T02:15:21.8965592Z 2022-11-23T02:15:21.8965717Z Generating XML reports... 
2022-11-23T02:15:21.8966346Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021426.xml 2022-11-23T02:15:21.8967062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8967517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8968094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8968566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8968878Z 2022-11-23T02:15:21.8968990Z Running tests... 2022-11-23T02:15:21.8969400Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8969938Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.8970454Z test_multi_limit_single_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.8970844Z ok (1.650s) 2022-11-23T02:15:21.8970997Z 2022-11-23T02:15:21.8971264Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8971593Z Ran 1 test in 1.650s 2022-11-23T02:15:21.8971738Z 2022-11-23T02:15:21.8971836Z OK 2022-11-23T02:15:21.8971971Z 2022-11-23T02:15:21.8972098Z Generating XML reports... 
2022-11-23T02:15:21.8972719Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021430.xml 2022-11-23T02:15:21.8973443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8973903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8974486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8975021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8975247Z 2022-11-23T02:15:21.8975359Z Running tests... 2022-11-23T02:15:21.8975774Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8976306Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.8976817Z test_single_limit_multi_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.8977205Z ok (1.623s) 2022-11-23T02:15:21.8977355Z 2022-11-23T02:15:21.8977621Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8977956Z Ran 1 test in 1.623s 2022-11-23T02:15:21.8978099Z 2022-11-23T02:15:21.8978195Z OK 2022-11-23T02:15:21.8978331Z 2022-11-23T02:15:21.8978458Z Generating XML reports... 
2022-11-23T02:15:21.8979085Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021434.xml 2022-11-23T02:15:21.8979801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8980254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8980834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8981308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8981520Z 2022-11-23T02:15:21.8981631Z Running tests... 2022-11-23T02:15:21.8982035Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8982580Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.8983091Z test_single_limit_single_dtype (__main__.ComputeBucketAssignmentTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.8983480Z ok (1.640s) 2022-11-23T02:15:21.8983629Z 2022-11-23T02:15:21.8983897Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8984224Z Ran 1 test in 1.640s 2022-11-23T02:15:21.8984371Z 2022-11-23T02:15:21.8984465Z OK 2022-11-23T02:15:21.8984599Z 2022-11-23T02:15:21.8984723Z Generating XML reports... 
2022-11-23T02:15:21.8985349Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021438.xml 2022-11-23T02:15:21.8986063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8986518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8987179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8987651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8987865Z 2022-11-23T02:15:21.8987978Z Running tests... 2022-11-23T02:15:21.8988385Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.8988922Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.8989435Z test_backend_class_attr (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.8989946Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50342 2022-11-23T02:15:21.8990398Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50343 2022-11-23T02:15:21.8990844Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50344 2022-11-23T02:15:21.8991273Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50345 2022-11-23T02:15:21.8991881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8992337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8992956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8993445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8994033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8994482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8995551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8996072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8996663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8997111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8997673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.8998145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.8998725Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.8999154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.8999771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9000239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9000682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:21.9001145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:21.9001609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:21.9002080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:21.9002413Z ok (4.063s) 2022-11-23T02:15:21.9002565Z 2022-11-23T02:15:21.9002839Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9003174Z Ran 1 test in 4.063s 2022-11-23T02:15:21.9003336Z 2022-11-23T02:15:21.9003431Z OK 2022-11-23T02:15:21.9003548Z 2022-11-23T02:15:21.9003675Z Generating XML reports... 
2022-11-23T02:15:21.9004321Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021442.xml 2022-11-23T02:15:21.9005175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9005616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9006195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9006676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9006908Z 2022-11-23T02:15:21.9007019Z Running tests... 2022-11-23T02:15:21.9007408Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9007942Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.9008470Z test_collectives (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.9008957Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50513 2022-11-23T02:15:21.9009412Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50514 2022-11-23T02:15:21.9009852Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50515 2022-11-23T02:15:21.9010298Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50516 2022-11-23T02:15:21.9010967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9011441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9012025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9012501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9013063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9013515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9014095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9014550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9015135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9015583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9016160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9016610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9017189Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9017635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9018199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9018667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9019104Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:21.9019584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:21.9020211Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T02:15:21.9020880Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T02:15:21.9021383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:21.9021856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:21.9022401Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:15:21.9022897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:21.9023390Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:15:21.9023868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:21.9024536Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:15:21.9025234Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:15:21.9025926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:15:21.9026602Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:15:21.9026995Z ok (6.061s) 2022-11-23T02:15:21.9027146Z 2022-11-23T02:15:21.9027420Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9027751Z Ran 1 test in 6.061s 2022-11-23T02:15:21.9027895Z 2022-11-23T02:15:21.9028048Z OK 2022-11-23T02:15:21.9028195Z 2022-11-23T02:15:21.9028322Z Generating XML reports... 2022-11-23T02:15:21.9028972Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021448.xml 2022-11-23T02:15:21.9029703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9030160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9030749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9031227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9031441Z 2022-11-23T02:15:21.9031553Z Running tests... 2022-11-23T02:15:21.9031959Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9032501Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.9033022Z test_get_backend_name (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.9033536Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50693 2022-11-23T02:15:21.9033990Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50694 2022-11-23T02:15:21.9034436Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50695 2022-11-23T02:15:21.9034860Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50696 2022-11-23T02:15:21.9035979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9036437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9037002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9037488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9038076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9038526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9039085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9039556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9040140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9040723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9041285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9041757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9042343Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9042771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9043346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9043809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9044252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:21.9044716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:21.9045181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:21.9045659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:21.9045988Z ok (4.038s) 2022-11-23T02:15:21.9046210Z 2022-11-23T02:15:21.9046498Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9046837Z Ran 1 test in 4.039s 2022-11-23T02:15:21.9047000Z 2022-11-23T02:15:21.9047095Z OK 2022-11-23T02:15:21.9047213Z 2022-11-23T02:15:21.9047339Z Generating XML reports... 
2022-11-23T02:15:21.9047987Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021457.xml 2022-11-23T02:15:21.9048735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9049179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9049762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9050229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9050461Z 2022-11-23T02:15:21.9050574Z Running tests... 2022-11-23T02:15:21.9050963Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9051498Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.9052019Z test_send_recv (__main__.PythonProcessGroupExtensionTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.9052500Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50864 2022-11-23T02:15:21.9052952Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50865 2022-11-23T02:15:21.9053401Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50866 2022-11-23T02:15:21.9053845Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50867 2022-11-23T02:15:21.9054449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9054906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9055489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9055963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9056530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9056978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9057551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9058075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9058658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9059105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9059689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9060139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9060716Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9061159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9061717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9062189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9062628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:21.9063274Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T02:15:21.9063818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:21.9064467Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T02:15:21.9064974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:21.9065612Z [W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:6789 (errno: 99 - Cannot assign requested address). 2022-11-23T02:15:21.9066097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:21.9066590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:15:21.9067088Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:21.9067567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:15:21.9068233Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:15:21.9068773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:21.9069429Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:15:21.9070105Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:15:21.9070801Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:15:21.9071194Z ok (6.044s) 2022-11-23T02:15:21.9071344Z 2022-11-23T02:15:21.9071616Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9071930Z Ran 1 test in 6.045s 2022-11-23T02:15:21.9072093Z 2022-11-23T02:15:21.9072192Z OK 2022-11-23T02:15:21.9072326Z 2022-11-23T02:15:21.9072454Z Generating XML reports... 2022-11-23T02:15:21.9073078Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021503.xml 2022-11-23T02:15:21.9073827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9074282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9074861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9075925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9076161Z 2022-11-23T02:15:21.9076272Z Running tests... 2022-11-23T02:15:21.9076694Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9077221Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.9077720Z test_op_isinstance_of_reduceop (__main__.ReduceOpTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.9078081Z ok (1.646s) 2022-11-23T02:15:21.9078230Z 2022-11-23T02:15:21.9078497Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9078808Z Ran 1 test in 1.646s 2022-11-23T02:15:21.9078969Z 2022-11-23T02:15:21.9079064Z OK 2022-11-23T02:15:21.9079198Z 2022-11-23T02:15:21.9079323Z Generating XML reports... 2022-11-23T02:15:21.9079871Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123021511.xml 2022-11-23T02:15:21.9080563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9081019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9081698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9082170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9082401Z 2022-11-23T02:15:21.9082512Z Running tests... 2022-11-23T02:15:21.9082919Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9083437Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.9083920Z test_reduceop_copyable (__main__.ReduceOpTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.9084269Z ok (1.661s) 2022-11-23T02:15:21.9084424Z 2022-11-23T02:15:21.9084690Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9085001Z Ran 1 test in 1.661s 2022-11-23T02:15:21.9085163Z 2022-11-23T02:15:21.9085258Z OK 2022-11-23T02:15:21.9085393Z 2022-11-23T02:15:21.9085519Z Generating XML reports... 
2022-11-23T02:15:21.9086064Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123021515.xml 2022-11-23T02:15:21.9086743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:21.9087194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:21.9087775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:21.9088233Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:21.9088464Z 2022-11-23T02:15:21.9088577Z Running tests... 2022-11-23T02:15:21.9088988Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9089506Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_common 2022-11-23T02:15:21.9089982Z test_reduceop_pickle (__main__.ReduceOpTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:21.9090335Z ok (1.640s) 2022-11-23T02:15:21.9090482Z 2022-11-23T02:15:21.9090752Z ---------------------------------------------------------------------- 2022-11-23T02:15:21.9091061Z Ran 1 test in 1.640s 2022-11-23T02:15:21.9091222Z 2022-11-23T02:15:21.9091317Z OK 2022-11-23T02:15:21.9091452Z 2022-11-23T02:15:21.9091577Z Generating XML reports... 2022-11-23T02:15:21.9092125Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123021519.xml 2022-11-23T02:15:21.9092453Z 2022-11-23T02:15:21.9092869Z ##[endgroup] 2022-11-23T02:15:21.9093447Z FINISHED PRINTING LOG FILE of distributed/test_c10d_common (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_common_l2ws5rx4) 2022-11-23T02:15:21.9093880Z 2022-11-23T02:15:21.9094182Z Running distributed/fsdp/test_fsdp_freezing_weights ... 
[2022-11-23 02:15:21.893412] 2022-11-23T02:15:21.9094888Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_freezing_weights.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:15:21.893708] 2022-11-23T02:15:59.8068488Z 2022-11-23T02:15:59.8069297Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_freezing_weights 2022-11-23T02:15:59.8070313Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_freezing_weights (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_freezing_weights_ma0ltuul) 2022-11-23T02:15:59.8070696Z 2022-11-23T02:15:59.8070812Z Running tests... 2022-11-23T02:15:59.8071347Z ---------------------------------------------------------------------- 2022-11-23T02:15:59.8071937Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights 2022-11-23T02:15:59.8072598Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:59.8075862Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51146 2022-11-23T02:15:59.8076593Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51147 2022-11-23T02:15:59.8077325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8077787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8078362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8078849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8079450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8079913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8080492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8080976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8081435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8081938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8082593Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8083302Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:15:59.8083837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8084319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8084795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8085291Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8085980Z dist init r=1, world=2 2022-11-23T02:15:59.8086223Z dist init r=0, world=2 2022-11-23T02:15:59.8086466Z ok (5.868s) 2022-11-23T02:15:59.8087052Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51229 2022-11-23T02:15:59.8087770Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51230 2022-11-23T02:15:59.8088550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8088993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8089579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8090057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8090631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8091136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8091733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8092216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8095482Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8096278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8097552Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8099083Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8100269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8101235Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8102305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8102914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8104239Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:15:59.8105111Z warnings.warn( 2022-11-23T02:15:59.8106298Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:15:59.8107089Z warnings.warn( 2022-11-23T02:15:59.8107329Z dist init r=0, world=2 2022-11-23T02:15:59.8107583Z dist init r=1, world=2 2022-11-23T02:15:59.8107825Z ok (4.211s) 2022-11-23T02:15:59.8108388Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51312 2022-11-23T02:15:59.8109058Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51313 2022-11-23T02:15:59.8109690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8110148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8110724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8111202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8111898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8112350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8112914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8113389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8113855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8114343Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8115465Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8116324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8118704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8119201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8119682Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8120305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8120694Z dist init r=0, world=2 2022-11-23T02:15:59.8120932Z dist init r=1, world=2 2022-11-23T02:15:59.8121173Z ok (4.112s) 2022-11-23T02:15:59.8121749Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51395 2022-11-23T02:15:59.8122426Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51396 2022-11-23T02:15:59.8123072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8123540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8124133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8124623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8125205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8125686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8126294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8126787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8127237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8127762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8128424Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8129129Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:15:59.8129669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8130163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8130655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8131150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8132418Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:15:59.8133314Z warnings.warn( 2022-11-23T02:15:59.8134481Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:15:59.8135255Z warnings.warn( 2022-11-23T02:15:59.8135514Z dist init r=0, world=2 2022-11-23T02:15:59.8135752Z dist init r=1, world=2 2022-11-23T02:15:59.8135994Z ok (4.211s) 2022-11-23T02:15:59.8136553Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51478 2022-11-23T02:15:59.8137264Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51479 2022-11-23T02:15:59.8137885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8138341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8138922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8139379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8139969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8140419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8140995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8141453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8141914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8142464Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8143124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8143821Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:15:59.8144336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8144809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8145293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8145766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8146129Z dist init r=1, world=2 2022-11-23T02:15:59.8146382Z dist init r=0, world=2 2022-11-23T02:15:59.8146605Z ok (4.311s) 2022-11-23T02:15:59.8147167Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51561 2022-11-23T02:15:59.8147815Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51562 2022-11-23T02:15:59.8148430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8148939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8149524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8150002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8150588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8151019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8151597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8152065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8152505Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8153009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8153671Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8154427Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8154950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8156098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8156580Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8157067Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8158336Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:15:59.8159130Z warnings.warn( 2022-11-23T02:15:59.8160288Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:15:59.8161051Z warnings.warn( 2022-11-23T02:15:59.8161307Z dist init r=0, world=2 2022-11-23T02:15:59.8161542Z dist init r=1, world=2 2022-11-23T02:15:59.8161778Z ok (4.311s) 2022-11-23T02:15:59.8162344Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51674 2022-11-23T02:15:59.8162975Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51675 2022-11-23T02:15:59.8163597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8164052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8164631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8165086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8165669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8166228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8166808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8167261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8167719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8168216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8168862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8169558Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8170090Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8170567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8171038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8171597Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8171974Z dist init r=0, world=2 2022-11-23T02:15:59.8172212Z dist init r=1, world=2 2022-11-23T02:15:59.8172451Z ok (4.311s) 2022-11-23T02:15:59.8173020Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51757 2022-11-23T02:15:59.8173672Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51758 2022-11-23T02:15:59.8174278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8174738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8175327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8175809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8176373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:59.8176818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:59.8177399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:59.8177853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:59.8178308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:59.8178811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:59.8179474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:59.8180156Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:15:59.8180685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:59.8181161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:59.8181644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8182112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:15:59.8183395Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:15:59.8184239Z warnings.warn( 2022-11-23T02:15:59.8185399Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:15:59.8186171Z warnings.warn(
2022-11-23T02:15:59.8186411Z dist init r=0, world=2
2022-11-23T02:15:59.8186664Z dist init r=1, world=2
2022-11-23T02:15:59.8186902Z ok (4.211s)
2022-11-23T02:15:59.8187035Z
2022-11-23T02:15:59.8187311Z ----------------------------------------------------------------------
2022-11-23T02:15:59.8187644Z Ran 8 tests in 35.548s
2022-11-23T02:15:59.8187804Z
2022-11-23T02:15:59.8187897Z OK
2022-11-23T02:15:59.8188033Z
2022-11-23T02:15:59.8188215Z Generating XML reports...
2022-11-23T02:15:59.8188870Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20221123021523.xml
2022-11-23T02:15:59.8189245Z
2022-11-23T02:15:59.8189621Z ##[endgroup]
2022-11-23T02:15:59.8190252Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_freezing_weights (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_freezing_weights_ma0ltuul)
2022-11-23T02:15:59.8190637Z
2022-11-23T02:15:59.8190914Z Running distributed/_tensor/test_device_mesh ... [2022-11-23 02:15:59.807031]
2022-11-23T02:15:59.8191610Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_device_mesh.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:15:59.807298]
2022-11-23T02:16:53.5102910Z
2022-11-23T02:16:53.5103464Z Expand the folded group to see the log file of distributed/_tensor/test_device_mesh
2022-11-23T02:16:53.5107518Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_device_mesh (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_device_mesh_huznf82i)
2022-11-23T02:16:53.5107923Z
2022-11-23T02:16:53.5108025Z Running tests...
2022-11-23T02:16:53.5108562Z ---------------------------------------------------------------------- 2022-11-23T02:16:53.5109133Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_device_mesh 2022-11-23T02:16:53.5109656Z test_all_gather_1d (__main__.DeviceMeshCollectiveTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:16:53.5112409Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51905 2022-11-23T02:16:53.5112883Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51906 2022-11-23T02:16:53.5113337Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51907 2022-11-23T02:16:53.5113788Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51908 2022-11-23T02:16:53.5114223Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 51909 2022-11-23T02:16:53.5116613Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 51910 2022-11-23T02:16:53.5117103Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 51911 2022-11-23T02:16:53.5117556Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 51912 2022-11-23T02:16:53.5118620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5119198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5120059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5120572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5121177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5121619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5122205Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5122694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5123285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5123724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5124316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5124793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5125405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5125991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5126809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5127286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5127858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5128567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5129174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5129798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5130667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5131156Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5131752Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5132224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5132801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5133264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5133847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5134308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5135173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5135632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5136233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5136713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5137149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5137641Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5138122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5138679Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5139161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5139635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5140108Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 3 2022-11-23T02:16:53.5140557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5140951Z skip: Need at least 8 CUDA devices (4.129s) 2022-11-23T02:16:53.5141450Z test_all_gather_nd (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52177 2022-11-23T02:16:53.5141974Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52178 2022-11-23T02:16:53.5142429Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52179 2022-11-23T02:16:53.5142876Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52180 2022-11-23T02:16:53.5143330Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52181 2022-11-23T02:16:53.5143764Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52182 2022-11-23T02:16:53.5144266Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52183 2022-11-23T02:16:53.5144722Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 52184 2022-11-23T02:16:53.5145329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5145787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5146376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5146854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5147428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5147883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5148463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: 
UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5148939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5149513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5149967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5150547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5150998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5151589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5152039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5152614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5153071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5153654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5154104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5154667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5155498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5156100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5156653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5157221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:16:53.5157694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5158284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5158737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5159297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5159769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5160353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5160789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5161373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5161835Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5162353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5162827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5163294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5163773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5164234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5164711Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5165183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5165648Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 5 2022-11-23T02:16:53.5166025Z skip: Need at least 8 CUDA devices (2.515s) 2022-11-23T02:16:53.5166525Z test_all_gather_uneven (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52449 2022-11-23T02:16:53.5167068Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52450 2022-11-23T02:16:53.5167518Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52451 2022-11-23T02:16:53.5167946Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52452 2022-11-23T02:16:53.5168391Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52453 2022-11-23T02:16:53.5168843Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52454 2022-11-23T02:16:53.5169271Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52455 2022-11-23T02:16:53.5169715Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 52456 2022-11-23T02:16:53.5170340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5170802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5171366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5171847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5172440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5172874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5173536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5174008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled 
tests") 2022-11-23T02:16:53.5174595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5175028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5175610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5176082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5176649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5177097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5177675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5178148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5178718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5179234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5179826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5180295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5180863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5181313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5181896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5182354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5182942Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5183394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5183975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5184431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5185013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5185462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5186020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5186495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5186939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5187420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5187882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5188350Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5188825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5189297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5189743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5190208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5190669Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:16:53.5191140Z 
test_all_reduce_1d (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52721 2022-11-23T02:16:53.5191677Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52722 2022-11-23T02:16:53.5192138Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52723 2022-11-23T02:16:53.5192595Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52724 2022-11-23T02:16:53.5193024Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52725 2022-11-23T02:16:53.5193462Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52726 2022-11-23T02:16:53.5193906Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52727 2022-11-23T02:16:53.5194328Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 52728 2022-11-23T02:16:53.5194948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5195708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5196298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5196840Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5197441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5197894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5198459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5198931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5199520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5199973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5200534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5201010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5201595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5202045Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5202605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5203076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5203700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5204137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5204716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5205183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5205774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5206208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5206787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5207253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5207816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T02:16:53.5208374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5208955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5209420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5209987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5210444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5211020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5211487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5211906Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5212397Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5212877Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5213330Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5213851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5214329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5214801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5215246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5215639Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:16:53.5216131Z test_all_reduce_nd (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52993 2022-11-23T02:16:53.5216661Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52994 2022-11-23T02:16:53.5217122Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52995 2022-11-23T02:16:53.5217565Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52996 2022-11-23T02:16:53.5218019Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52997 2022-11-23T02:16:53.5218447Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52998 2022-11-23T02:16:53.5218884Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52999 2022-11-23T02:16:53.5219328Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 53000 2022-11-23T02:16:53.5219925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5220379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5220965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5221438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5222003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5222462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5223046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5223517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5224082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5224536Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5225114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5225631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5226219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5226672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5227253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5227704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5228291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5228745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5229304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5229778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5230360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5230809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5231426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5231903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5232489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5232937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5233495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5233970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5234555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5234987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5235940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5236413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5236854Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5237316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5237782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5238259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5238716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5239187Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5239649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5240115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5240493Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:16:53.5240977Z test_all_to_all_1d (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53265 2022-11-23T02:16:53.5241516Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53266 2022-11-23T02:16:53.5241953Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53267 2022-11-23T02:16:53.5242502Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53268 2022-11-23T02:16:53.5242943Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 53269 2022-11-23T02:16:53.5243385Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 53270 2022-11-23T02:16:53.5243811Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 53271 2022-11-23T02:16:53.5244255Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 53272 2022-11-23T02:16:53.5244879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5245317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5245900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5246375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5246967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5247398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5247983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5248528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5249128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5249557Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5250141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5250611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5251179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5251631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5252208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5252676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5253242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5253696Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5254274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5254744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5255309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5255768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5256349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5256801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5257392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5257842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5258424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5258873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5259462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5259978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5260542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5261010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5261458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5261941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5262400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5262865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5263341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5263795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5264393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5264863Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5265259Z skip: Need at least 8 CUDA devices (2.514s) 2022-11-23T02:16:53.5265807Z test_all_to_all_nd (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53537 2022-11-23T02:16:53.5266354Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53538 2022-11-23T02:16:53.5266809Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53539 2022-11-23T02:16:53.5267260Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53540 2022-11-23T02:16:53.5267689Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 53541 2022-11-23T02:16:53.5268131Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 53542 2022-11-23T02:16:53.5268583Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 53543 2022-11-23T02:16:53.5269008Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 53544 2022-11-23T02:16:53.5269619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5270080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5270668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5271126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5271716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5272166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5272728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5273208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5273798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5274254Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5274815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5275518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5276107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5276555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5277111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5277676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5278257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5278684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5279267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5279735Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5280315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5280742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5281316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5281792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5282359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5282808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5283458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5283937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5284504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5284955Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5285531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5286004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5286431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5286910Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5287383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5287840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5288298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5288769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5289235Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5289685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5290081Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:16:53.5290567Z test_broadcast_1d (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53809 2022-11-23T02:16:53.5291085Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53810 2022-11-23T02:16:53.5291541Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53811 2022-11-23T02:16:53.5291980Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53812 2022-11-23T02:16:53.5292424Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 53813 2022-11-23T02:16:53.5292848Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 53814 2022-11-23T02:16:53.5293296Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 53815 2022-11-23T02:16:53.5293746Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 53816 2022-11-23T02:16:53.5294411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5294863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5295445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5295922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5296493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5296946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5297525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5297996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5298560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5299017Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5299595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5300044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5300683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5301142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5301718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5302166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5302746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5303196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5303803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5304264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5304848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5305295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5305851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5306325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5306907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5307358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5307898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5308349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5308934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5309389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5309978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5310445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5310889Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5311349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5311882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5312353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5312809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5313283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5313745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5314210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5314582Z skip: Need at least 8 CUDA devices (2.617s) 2022-11-23T02:16:53.5315357Z test_broadcast_nd (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54081 2022-11-23T02:16:53.5315908Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54082 2022-11-23T02:16:53.5316343Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54083 2022-11-23T02:16:53.5316795Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54084 2022-11-23T02:16:53.5317237Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54085 2022-11-23T02:16:53.5317761Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54086 2022-11-23T02:16:53.5318197Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54087 2022-11-23T02:16:53.5318633Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54088 2022-11-23T02:16:53.5319250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5319690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5320273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5320751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5321339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5321776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5322355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5322822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5323407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5323837Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5324415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5324888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5325452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5325896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5326480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5326947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5327510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5327963Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5328542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5329087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5329671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5330117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5330694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5331143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5331727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5332177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5332758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5333213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5333797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5334245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5334861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5335337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5335777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5336257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5336712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5337182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5337659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5338110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5338573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5339029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5339421Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:16:53.5339900Z test_reduce_scatter_1d (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54353 2022-11-23T02:16:53.5340442Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54354 2022-11-23T02:16:53.5340894Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54355 2022-11-23T02:16:53.5341339Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54356 2022-11-23T02:16:53.5341769Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54357 2022-11-23T02:16:53.5342206Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54358 2022-11-23T02:16:53.5342647Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54359 2022-11-23T02:16:53.5343073Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54360 2022-11-23T02:16:53.5343687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5344139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5344721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5345180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5345768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5346286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5346852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5347325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5347912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5348361Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5348919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5349388Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5349972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5350406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5350983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5351448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5352087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5352522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5353101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5353564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5354150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5354582Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5355403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5355879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5356455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5356900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5357478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5357944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5358510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5358959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5359538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5359988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5360427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5360908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5361382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5361839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5362308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5362776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5363341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5363784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5364173Z skip: Need at least 8 CUDA devices (2.815s) 2022-11-23T02:16:53.5364674Z test_reduce_scatter_nd (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54625 2022-11-23T02:16:53.5365205Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54626 2022-11-23T02:16:53.5365659Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54627 2022-11-23T02:16:53.5366106Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54628 2022-11-23T02:16:53.5366550Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54629 2022-11-23T02:16:53.5366983Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54630 2022-11-23T02:16:53.5367430Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54631 2022-11-23T02:16:53.5367870Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54632 2022-11-23T02:16:53.5368473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5368998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5369594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5370067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5370631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5371085Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5371658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5390711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5391362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5391823Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5392422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5392908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5393476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5393930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5394504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5394985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5395782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5396233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5396813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5397266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5397854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5398300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5398872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5399485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5400075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5400521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5401079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5401545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5402127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5402571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5403126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5403640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5404087Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5404565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5405024Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5405576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5406063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5406514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5406982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5407445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5407844Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:16:53.5408326Z test_reduce_scatter_uneven (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54897 2022-11-23T02:16:53.5408869Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54898 2022-11-23T02:16:53.5409329Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54899 2022-11-23T02:16:53.5409765Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54900 2022-11-23T02:16:53.5410213Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54901 2022-11-23T02:16:53.5410658Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54902 2022-11-23T02:16:53.5411109Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54903 2022-11-23T02:16:53.5411537Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54904 2022-11-23T02:16:53.5412160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5412617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5413181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5413657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5414302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5414752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5415311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5415780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5416363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5416884Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5417442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5417912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5418494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5418919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5419487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5419949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5420527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5420956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5421526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5421986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5422600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5423056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5423633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5424094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5424655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5425112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5425838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5426303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5426872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5427321Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5427894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5428359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5428779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5429256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5429733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5430189Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5430650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5431114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5431581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5432031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5432419Z skip: Need at least 8 CUDA devices (2.715s) 2022-11-23T02:16:53.5432902Z test_scatter_1d (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55169 2022-11-23T02:16:53.5433417Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55170 2022-11-23T02:16:53.5433947Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55171 2022-11-23T02:16:53.5434396Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55172 2022-11-23T02:16:53.5434846Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 55173 2022-11-23T02:16:53.5435557Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 55174 2022-11-23T02:16:53.5436004Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 55175 2022-11-23T02:16:53.5436448Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 55176 2022-11-23T02:16:53.5437055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5437509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5438088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5438565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5439132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5439580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5440244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5440708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5441296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5441738Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5442309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5442765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5443347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5443792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5444367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5444814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5445392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5445832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5446391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5446863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5447438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5447880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5448437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5448904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5449479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5449902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5450472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5450938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5451621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5452044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5452615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5453079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5453517Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5453979Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5454437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5454908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5455366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5455835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5456302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5456824Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5457210Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:16:53.5457693Z test_scatter_nd (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55441 2022-11-23T02:16:53.5458225Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55442 2022-11-23T02:16:53.5458660Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55443 2022-11-23T02:16:53.5459107Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55444 2022-11-23T02:16:53.5459558Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 55445 2022-11-23T02:16:53.5460006Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 55446 2022-11-23T02:16:53.5460435Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 55447 2022-11-23T02:16:53.5460879Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 55448 2022-11-23T02:16:53.5461500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5461938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5462518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5462993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5463576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5464011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5464583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5465052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5465636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5466066Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5466642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5467106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5467664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5468180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5468755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5469214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5469779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5470232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5470798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5471229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5471804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5472276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5472866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5473315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5473955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5474415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5474991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5475672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5476258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5476701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5477267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5477730Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5478173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5478653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5479112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5479573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5480043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5480492Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5480959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5481423Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5481814Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:16:53.5482288Z test_scatter_uneven (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55713 2022-11-23T02:16:53.5482823Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55714 2022-11-23T02:16:53.5483276Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55715 2022-11-23T02:16:53.5483707Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55716 2022-11-23T02:16:53.5484157Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 55717 2022-11-23T02:16:53.5484599Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 55718 2022-11-23T02:16:53.5485147Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 55719 2022-11-23T02:16:53.5485574Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 55720 2022-11-23T02:16:53.5486179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5486630Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5487213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5487667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5488246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5488691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5489250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5489722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5490305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5490749Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5491376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5491853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5492434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5492861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5493428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5493895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5494471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5494895Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5495458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5495919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5496492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5496917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5497490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5497950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5498514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5498956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5499531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5499994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5500550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5500990Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5501558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5502001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5502510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5502986Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5503490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5503949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5504413Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5504875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5505336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5505780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5506165Z skip: Need at least 8 CUDA devices (2.715s) 2022-11-23T02:16:53.5506627Z test_device_mesh_2d (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55985 2022-11-23T02:16:53.5507122Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55986 2022-11-23T02:16:53.5507572Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55987 2022-11-23T02:16:53.5508078Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55988 2022-11-23T02:16:53.5508535Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 55989 2022-11-23T02:16:53.5508965Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 55990 2022-11-23T02:16:53.5509405Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 55991 2022-11-23T02:16:53.5509852Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 55992 2022-11-23T02:16:53.5510454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5510914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5511492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5511962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5512531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5512978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5513552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5514003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5514579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5515237Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5515834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5516278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5516857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5517299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5517874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5518321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5518896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5519435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5519992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5520460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5521046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5521486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5522035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5522500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5523076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5523502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5524074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5524537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5525111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5525607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5526197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5526658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5527094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5527550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5528019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5528491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5528940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5529400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5529869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5530329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5530700Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:16:53.5531180Z test_device_mesh_2d_from_dim_groups (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56257 2022-11-23T02:16:53.5531711Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56258 2022-11-23T02:16:53.5532150Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56259 2022-11-23T02:16:53.5532596Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56260 2022-11-23T02:16:53.5533043Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 56261 2022-11-23T02:16:53.5533492Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 56262 2022-11-23T02:16:53.5533921Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 56263 2022-11-23T02:16:53.5534357Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 56264 2022-11-23T02:16:53.5534973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5535409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5535987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5536526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5537110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5537535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5538113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5538583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5539160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5539587Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5540160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5540631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5541191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5541639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5542267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5542742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5543302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5543748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5544322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5544765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5545343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5545786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5546362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5546810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5547384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5547828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5548397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5548839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5549424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5549868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5550420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5550888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5551324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5551801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5552255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5552714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5553182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5553700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5554164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5554619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5555010Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:16:53.5555691Z test_device_mesh_dim_groups_error (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56529 2022-11-23T02:16:53.5556218Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56530 2022-11-23T02:16:53.5556670Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56531 2022-11-23T02:16:53.5557101Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56532 2022-11-23T02:16:53.5557552Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 56533 2022-11-23T02:16:53.5557990Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 56534 2022-11-23T02:16:53.5558435Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 56535 2022-11-23T02:16:53.5558861Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 56536 2022-11-23T02:16:53.5559553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5560016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5560582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5561049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5561630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5562084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5562647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5563111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5563691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5564133Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5564682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5565130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5565710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5566172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5566765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5567232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5567811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5568235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5568796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5569249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5569810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5570226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5570878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5571318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5571870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5572323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5572893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5573353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5573910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5574354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5574918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5575356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5575793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5576320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5576804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5577256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5577721Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5578186Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5578635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5579107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5579495Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:16:53.5579956Z test_device_mesh_nd (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56801 2022-11-23T02:16:53.5580454Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56802 2022-11-23T02:16:53.5580906Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56803 2022-11-23T02:16:53.5581355Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56804 2022-11-23T02:16:53.5581801Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 56805 2022-11-23T02:16:53.5582230Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 56806 2022-11-23T02:16:53.5582665Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 56807 2022-11-23T02:16:53.5583112Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 56808 2022-11-23T02:16:53.5583710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5584162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5584743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5585215Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5585780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5586221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5586795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5587312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5587893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5588332Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5588904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5589350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5589931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5590377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5590930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5591387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5591972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5592413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5593017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5593491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5594065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5594502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5595261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5595740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5596327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5596751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:16:53.5597323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5597790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5598367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:53.5598793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:53.5599362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:53.5599820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:53.5600243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:16:53.5600717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:16:53.5601183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:16:53.5601654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:16:53.5602111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:53.5602573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:53.5603033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:16:53.5603539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:16:53.5603914Z skip: Need at least 8 CUDA devices (2.514s) 2022-11-23T02:16:53.5604111Z 2022-11-23T02:16:53.5604494Z ---------------------------------------------------------------------- 2022-11-23T02:16:53.5604828Z Ran 19 tests in 51.297s 2022-11-23T02:16:53.5604995Z 2022-11-23T02:16:53.5605089Z 
OK (skipped=19) 2022-11-23T02:16:53.5605251Z 2022-11-23T02:16:53.5605378Z Generating XML reports... 2022-11-23T02:16:53.5606010Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshCollectiveTest-20221123021601.xml 2022-11-23T02:16:53.5606790Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshTest-20221123021601.xml 2022-11-23T02:16:53.5607110Z 2022-11-23T02:16:53.5607597Z ##[endgroup] 2022-11-23T02:16:53.5608217Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_device_mesh (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_device_mesh_huznf82i) 2022-11-23T02:16:53.5608570Z 2022-11-23T02:16:53.5608836Z Running distributed/test_pg_wrapper ... [2022-11-23 02:16:53.511325] 2022-11-23T02:16:53.5609527Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_pg_wrapper.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:16:53.511638] 2022-11-23T02:18:30.5972847Z 2022-11-23T02:18:30.5973377Z Expand the folded group to see the log file of distributed/test_pg_wrapper 2022-11-23T02:18:30.5978580Z ##[group]PRINTING LOG FILE of distributed/test_pg_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-test_pg_wrapper_dyxmt_4d) 2022-11-23T02:18:30.5979129Z 2022-11-23T02:18:30.5979465Z 2022-11-23T02:18:30.5980960Z , <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest 
testMethod=test_collectives_op_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-11-23T02:18:30.5982435Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5982876Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5983345Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5983812Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5984313Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5984777Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5985326Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5985782Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5986264Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:18:30.5987306Z , <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-11-23T02:18:30.5988304Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:18:30.5988741Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:18:30.5989314Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:18:30.5989776Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:18:30.5990239Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:18:30.5990980Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.5991449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.5992059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.5992548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.5992783Z 2022-11-23T02:18:30.5992879Z Running tests... 2022-11-23T02:18:30.5993298Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.5993853Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.5994377Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.5994869Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57141 2022-11-23T02:18:30.5996263Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57142 2022-11-23T02:18:30.5997023Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57143 2022-11-23T02:18:30.5998034Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57144 2022-11-23T02:18:30.5999261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6000117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6001160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6001967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6003039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6003839Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6004894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6005747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6006769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6007845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6008526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6008987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6009569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6010045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6010623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6011100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6011539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6012000Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6012470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6012934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6013840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6014329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-11-23T02:18:30.6014827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:18:30.6015323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:18:30.6015990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6016670Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6017364Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6018065Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6018588Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:18:30.6019048Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:18:30.6019700Z [E ProcessGroupGloo.cpp:137] Rank 2 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-11-23T02:18:30.6020395Z [E ProcessGroupGloo.cpp:137] Rank 3 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-11-23T02:18:30.6020849Z ok (4.258s) 2022-11-23T02:18:30.6020982Z 2022-11-23T02:18:30.6021258Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6021607Z Ran 1 test in 4.258s 2022-11-23T02:18:30.6021778Z 2022-11-23T02:18:30.6021873Z OK 2022-11-23T02:18:30.6022010Z 2022-11-23T02:18:30.6022119Z Generating XML reports... 
2022-11-23T02:18:30.6022750Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021657.xml 2022-11-23T02:18:30.6023492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6023948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6024515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6024990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6025226Z 2022-11-23T02:18:30.6025337Z Running tests... 2022-11-23T02:18:30.6025727Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6026271Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6026815Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6027328Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57348 2022-11-23T02:18:30.6027761Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57349 2022-11-23T02:18:30.6028203Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57350 2022-11-23T02:18:30.6028856Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57351 2022-11-23T02:18:30.6029572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6030032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6030609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6031167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6031737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6032190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6032776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6033241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6033809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6034253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6034827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6036023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6036629Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6037075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6037656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6038205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6038661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6039141Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6039772Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6040451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6040952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6041453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6041929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:18:30.6042420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:18:30.6043091Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6043782Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6044456Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:18:30.6045147Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6045547Z ok (4.240s) 2022-11-23T02:18:30.6045699Z 2022-11-23T02:18:30.6045969Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6046284Z Ran 1 test in 4.240s 2022-11-23T02:18:30.6046451Z 2022-11-23T02:18:30.6046549Z OK 2022-11-23T02:18:30.6046683Z 2022-11-23T02:18:30.6046812Z Generating XML reports... 2022-11-23T02:18:30.6047430Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021703.xml 2022-11-23T02:18:30.6048170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6048624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6049208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6049790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6050024Z 2022-11-23T02:18:30.6050137Z Running tests... 2022-11-23T02:18:30.6050548Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6051065Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6051619Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6052142Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57555 2022-11-23T02:18:30.6052594Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57556 2022-11-23T02:18:30.6053022Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57557 2022-11-23T02:18:30.6053458Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57558 2022-11-23T02:18:30.6054070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6054514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6055094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6055564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6056212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6056650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6057229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6057698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6058260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6058711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6059293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6059761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6060332Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6060784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6061360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6061827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6062249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6062749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6063230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6063681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6064069Z skip: Need at least 4 CUDA devices (4.057s) 2022-11-23T02:18:30.6064265Z 2022-11-23T02:18:30.6064545Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6064875Z Ran 1 test in 4.057s 2022-11-23T02:18:30.6065020Z 2022-11-23T02:18:30.6065131Z OK (skipped=1) 2022-11-23T02:18:30.6065287Z 2022-11-23T02:18:30.6065414Z Generating XML reports... 
2022-11-23T02:18:30.6066045Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021710.xml 2022-11-23T02:18:30.6066768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6067290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6067874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6068348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6068561Z 2022-11-23T02:18:30.6068672Z Running tests... 2022-11-23T02:18:30.6069080Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6069621Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6070164Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6070697Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57726 2022-11-23T02:18:30.6071151Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57727 2022-11-23T02:18:30.6071604Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57728 2022-11-23T02:18:30.6072031Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57729 2022-11-23T02:18:30.6072641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6073151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6073733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6074207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6074790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6075811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6076386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6076864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6077443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6077890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6078456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6078925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6079509Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6079936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6080508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6080975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6081417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6081880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6082349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6082819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6083194Z skip: Need at least 4 CUDA devices (4.052s) 2022-11-23T02:18:30.6083395Z 2022-11-23T02:18:30.6083672Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6084006Z Ran 1 test in 4.052s 2022-11-23T02:18:30.6084169Z 2022-11-23T02:18:30.6084278Z OK (skipped=1) 2022-11-23T02:18:30.6084417Z 2022-11-23T02:18:30.6084543Z Generating XML reports... 
2022-11-23T02:18:30.6085311Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021716.xml 2022-11-23T02:18:30.6086048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6086485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6087070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6087543Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6087776Z 2022-11-23T02:18:30.6087887Z Running tests... 2022-11-23T02:18:30.6088277Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6088818Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6089371Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6089902Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57897 2022-11-23T02:18:30.6090335Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57898 2022-11-23T02:18:30.6090775Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57899 2022-11-23T02:18:30.6091291Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57900 2022-11-23T02:18:30.6091904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6092357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6092935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6093407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6093978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6094431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6095009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6095469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6096054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6096500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6097075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6097528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6098107Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6098557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6099132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6099580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6100023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6100500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6100952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6101415Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6101905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6102480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:18:30.6102958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6103451Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:18:30.6104121Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6104799Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6105491Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:18:30.6106177Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6106710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:18:30.6107242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:18:30.6107733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:18:30.6108274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:18:30.6108941Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6109610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6110296Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6110982Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6111377Z ok (4.426s) 2022-11-23T02:18:30.6111509Z 2022-11-23T02:18:30.6111778Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6112108Z Ran 1 test in 4.427s 2022-11-23T02:18:30.6112271Z 2022-11-23T02:18:30.6112367Z OK 2022-11-23T02:18:30.6112501Z 2022-11-23T02:18:30.6112614Z Generating XML reports... 
2022-11-23T02:18:30.6113246Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021723.xml 2022-11-23T02:18:30.6113986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6114443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6115007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6115984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6116225Z 2022-11-23T02:18:30.6116338Z Running tests... 2022-11-23T02:18:30.6116738Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6117278Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6117817Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6118330Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58116 2022-11-23T02:18:30.6118766Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58117 2022-11-23T02:18:30.6119204Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 58118 2022-11-23T02:18:30.6119648Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 58119 2022-11-23T02:18:30.6120250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6120806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6121391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6121862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6122436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6122886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6123462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6123925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6124488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6124939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6125516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6125968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6126617Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6127076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6127653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6128096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6128541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6129018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6129480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6129948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6130442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6130940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:18:30.6131415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6131903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:18:30.6132564Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6133258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6133935Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:18:30.6134626Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6135027Z ok (4.227s) 2022-11-23T02:18:30.6135178Z 2022-11-23T02:18:30.6135448Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6135766Z Ran 1 test in 4.227s 2022-11-23T02:18:30.6135930Z 2022-11-23T02:18:30.6136028Z OK 2022-11-23T02:18:30.6136165Z 2022-11-23T02:18:30.6136291Z Generating XML reports... 2022-11-23T02:18:30.6136907Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021729.xml 2022-11-23T02:18:30.6137645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6138170Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6138754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6139207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6139443Z 2022-11-23T02:18:30.6139560Z Running tests... 2022-11-23T02:18:30.6139967Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6140488Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6141030Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6141549Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58323 2022-11-23T02:18:30.6142003Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58324 2022-11-23T02:18:30.6142439Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 58325 2022-11-23T02:18:30.6142885Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 58326 2022-11-23T02:18:30.6143499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6143989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6144585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6145059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6145647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6146076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6146655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6147130Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6147696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6148139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6148716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6149180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6149740Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6150186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6150762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6151231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6151654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6152132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6152609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6153065Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6153455Z skip: Need at least 4 CUDA devices (4.073s) 2022-11-23T02:18:30.6153652Z 2022-11-23T02:18:30.6153929Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6154262Z Ran 1 test in 4.074s 2022-11-23T02:18:30.6154410Z 2022-11-23T02:18:30.6154520Z OK (skipped=1) 2022-11-23T02:18:30.6154677Z 2022-11-23T02:18:30.6154802Z Generating XML reports... 
2022-11-23T02:18:30.6156033Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021736.xml 2022-11-23T02:18:30.6156755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6157209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6157794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6158266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6158477Z 2022-11-23T02:18:30.6158588Z Running tests... 2022-11-23T02:18:30.6158999Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6159536Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6160075Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6160609Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58494 2022-11-23T02:18:30.6161059Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58495 2022-11-23T02:18:30.6161506Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 58496 2022-11-23T02:18:30.6162023Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 58497 2022-11-23T02:18:30.6162651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6163107Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6163668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6164143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6164735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6165179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6165743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6166216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6166797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6167243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6167797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6168263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6168845Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6169280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6169856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6170324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6170764Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6171223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6171687Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6172157Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6172535Z skip: Need at least 4 CUDA devices (3.964s) 2022-11-23T02:18:30.6172732Z 2022-11-23T02:18:30.6173102Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6173440Z Ran 1 test in 3.964s 2022-11-23T02:18:30.6173603Z 2022-11-23T02:18:30.6173718Z OK (skipped=1) 2022-11-23T02:18:30.6173855Z 2022-11-23T02:18:30.6173980Z Generating XML reports... 
2022-11-23T02:18:30.6174612Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021742.xml 2022-11-23T02:18:30.6175350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6175785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6176364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6176834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6177066Z 2022-11-23T02:18:30.6177176Z Running tests... 2022-11-23T02:18:30.6177572Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6178106Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6178654Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6179238Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58665 2022-11-23T02:18:30.6179684Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58666 2022-11-23T02:18:30.6180126Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 58667 2022-11-23T02:18:30.6180569Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 58668 2022-11-23T02:18:30.6181163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6181616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6182202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6182678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6183245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6183695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6184273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6184731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6185313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6185758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6186341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6186793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6187376Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6187826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6188384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6188851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6189290Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6189768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:18:30.6190219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:18:30.6190760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6191256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6191758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6192234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:18:30.6192723Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:18:30.6193383Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6194054Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6194746Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:18:30.6195958Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:18:30.6196500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:18:30.6197066Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:18:30.6197572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:18:30.6198059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:18:30.6198716Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6199384Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6200081Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6200766Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:18:30.6201160Z ok (4.330s) 2022-11-23T02:18:30.6201298Z 2022-11-23T02:18:30.6201571Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6201906Z Ran 1 test in 4.330s 2022-11-23T02:18:30.6202069Z 2022-11-23T02:18:30.6202165Z OK 2022-11-23T02:18:30.6202302Z 2022-11-23T02:18:30.6202411Z Generating XML reports... 
2022-11-23T02:18:30.6203045Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021749.xml 2022-11-23T02:18:30.6203782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6204238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6204804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6205278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6205510Z 2022-11-23T02:18:30.6205624Z Running tests... 2022-11-23T02:18:30.6206017Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6206553Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6207115Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6207622Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58884 2022-11-23T02:18:30.6208057Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58885 2022-11-23T02:18:30.6208780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6209235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6209798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6210273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6210854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6211299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6211863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6212329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6212768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6213248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6213720Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6214216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6214936Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:18:30.6215625Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:30.6216154Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:18:30.6216629Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:18:30.6216978Z ok (3.926s) 2022-11-23T02:18:30.6217116Z 2022-11-23T02:18:30.6217389Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6217722Z Ran 1 test in 3.926s 2022-11-23T02:18:30.6217884Z 2022-11-23T02:18:30.6217980Z OK 2022-11-23T02:18:30.6218115Z 2022-11-23T02:18:30.6218224Z Generating XML reports... 2022-11-23T02:18:30.6218856Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021755.xml 2022-11-23T02:18:30.6219593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6220052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6220618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6221091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6221323Z 2022-11-23T02:18:30.6221438Z Running tests... 2022-11-23T02:18:30.6221828Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6222365Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6222905Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6223424Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58997 2022-11-23T02:18:30.6223860Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58998 2022-11-23T02:18:30.6224475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6224928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6225509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6226040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6226625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6227076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6227641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6228112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6228555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6229033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6229509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6230007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6230678Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:18:30.6231373Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:30.6231754Z ok (5.018s) 2022-11-23T02:18:30.6231905Z 2022-11-23T02:18:30.6232230Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6232573Z Ran 1 test in 5.019s 2022-11-23T02:18:30.6232738Z 2022-11-23T02:18:30.6232815Z OK 2022-11-23T02:18:30.6232950Z 2022-11-23T02:18:30.6233079Z Generating XML reports... 2022-11-23T02:18:30.6233714Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021801.xml 2022-11-23T02:18:30.6234454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6234897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6236020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6236497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6236733Z 2022-11-23T02:18:30.6236825Z Running tests... 2022-11-23T02:18:30.6237236Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6237775Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6238328Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6238839Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59126 2022-11-23T02:18:30.6239291Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59127 2022-11-23T02:18:30.6239911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6240344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6240930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6241402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6241987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6242420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6242997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6243463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6243886Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6244476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6244965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6245462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6246113Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:18:30.6246810Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:30.6247348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:18:30.6247842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:18:30.6248476Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:18:30.6249173Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:18:30.6249571Z ok (5.032s) 2022-11-23T02:18:30.6249723Z 2022-11-23T02:18:30.6250070Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6250396Z Ran 1 test in 5.033s 2022-11-23T02:18:30.6250559Z 2022-11-23T02:18:30.6250654Z OK 2022-11-23T02:18:30.6250789Z 2022-11-23T02:18:30.6250916Z Generating XML reports... 2022-11-23T02:18:30.6251530Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021809.xml 2022-11-23T02:18:30.6252267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6252724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6253318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6253780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6254012Z 2022-11-23T02:18:30.6254124Z Running tests... 
2022-11-23T02:18:30.6254536Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6255055Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6255591Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6256103Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59265 2022-11-23T02:18:30.6256551Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59266 2022-11-23T02:18:30.6257149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6257611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6258194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6258652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6259236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6259681Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6260260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6260710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6261152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6261648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6262191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6262676Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6263342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:30.6264039Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:30.6264419Z ok (5.553s) 2022-11-23T02:18:30.6264569Z 2022-11-23T02:18:30.6264843Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6265174Z Ran 1 test in 5.553s 2022-11-23T02:18:30.6265338Z 2022-11-23T02:18:30.6265432Z OK 2022-11-23T02:18:30.6265550Z 2022-11-23T02:18:30.6265676Z Generating XML reports... 2022-11-23T02:18:30.6266308Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021816.xml 2022-11-23T02:18:30.6267049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6267481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6268116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6268598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6268829Z 2022-11-23T02:18:30.6268941Z Running tests... 2022-11-23T02:18:30.6269331Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6269866Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:18:30.6270420Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:30.6270934Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59395 2022-11-23T02:18:30.6271384Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59396 2022-11-23T02:18:30.6271991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6272452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6273013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6273485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6274066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:30.6274494Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:30.6275399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:30.6275965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:30.6276412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:30.6276879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:30.6277370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:30.6277863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:30.6278535Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:18:30.6279213Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:30.6279857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:18:30.6280352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:18:30.6280992Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:18:30.6281682Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:18:30.6282081Z ok (5.549s) 2022-11-23T02:18:30.6282232Z 2022-11-23T02:18:30.6282502Z ---------------------------------------------------------------------- 2022-11-23T02:18:30.6282816Z Ran 1 test in 5.549s 2022-11-23T02:18:30.6282979Z 2022-11-23T02:18:30.6283074Z OK 2022-11-23T02:18:30.6283210Z 2022-11-23T02:18:30.6283336Z Generating XML reports... 2022-11-23T02:18:30.6283952Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021824.xml 2022-11-23T02:18:30.6284338Z 2022-11-23T02:18:30.6284712Z ##[endgroup] 2022-11-23T02:18:30.6285289Z FINISHED PRINTING LOG FILE of distributed/test_pg_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-test_pg_wrapper_dyxmt_4d) 2022-11-23T02:18:30.6285627Z 2022-11-23T02:18:30.6285984Z Running distributed/fsdp/test_fsdp_comm_hooks ... [2022-11-23 02:18:30.598453] 2022-11-23T02:18:30.6286678Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm_hooks.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:18:30.598852] 2022-11-23T02:20:13.7400397Z 2022-11-23T02:20:13.7400928Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_comm_hooks 2022-11-23T02:20:13.7401897Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hooks_3qugb7bv) 2022-11-23T02:20:13.7402275Z 2022-11-23T02:20:13.7402394Z Running tests... 2022-11-23T02:20:13.7405465Z ---------------------------------------------------------------------- 2022-11-23T02:20:13.7406042Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks 2022-11-23T02:20:13.7406678Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:20:13.7407246Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59535 2022-11-23T02:20:13.7409636Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59536 2022-11-23T02:20:13.7410362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7410893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7411488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7411970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7412562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7413026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7413617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7414080Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7414544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7415055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7415748Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7416699Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7417231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7417708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7418084Z dist init r=1, world=2 2022-11-23T02:20:13.7418327Z dist init r=0, world=2 2022-11-23T02:20:13.7418575Z ok (5.380s) 2022-11-23T02:20:13.7419090Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59618 2022-11-23T02:20:13.7419689Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59619 2022-11-23T02:20:13.7420338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7420807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7421404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7421869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7422568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7423038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7423631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7424087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7424549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7425061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7425752Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7426457Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7426989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7427475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7427824Z dist init r=1, world=2 2022-11-23T02:20:13.7428085Z dist init r=0, world=2 2022-11-23T02:20:13.7428329Z ok (3.812s) 2022-11-23T02:20:13.7428836Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59701 2022-11-23T02:20:13.7429457Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59702 2022-11-23T02:20:13.7430095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7430553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7431134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7431610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7432192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7432639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7433200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7433671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7434208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7435011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7436100Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7436796Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7437324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7437784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7438145Z dist init r=1, world=2 2022-11-23T02:20:13.7438400Z dist init r=0, world=2 2022-11-23T02:20:13.7438641Z ok (3.812s) 2022-11-23T02:20:13.7439145Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59784 2022-11-23T02:20:13.7439748Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59785 2022-11-23T02:20:13.7440467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7440921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7441511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7441981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7442568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7442998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7443583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7444054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7444514Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7445001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7445669Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7446364Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7446892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7447353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7447715Z dist init r=0, world=2 2022-11-23T02:20:13.7447969Z dist init r=1, world=2 2022-11-23T02:20:13.7448193Z ok (3.813s) 2022-11-23T02:20:13.7448705Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59867 2022-11-23T02:20:13.7449309Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59868 2022-11-23T02:20:13.7449929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7450367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7450950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7451419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7452107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7452539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7453117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7453586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7454025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7454525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7455186Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7455888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7456404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7456881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7457246Z dist init r=1, world=2 2022-11-23T02:20:13.7457484Z dist init r=0, world=2 2022-11-23T02:20:13.7457728Z ok (3.713s) 2022-11-23T02:20:13.7458308Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59950 2022-11-23T02:20:13.7458924Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59951 2022-11-23T02:20:13.7459530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7459986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7460570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7461048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7461613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7462068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7462648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7463098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7463556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7464055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7464720Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7465408Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7465939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7466416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7466760Z dist init r=0, world=2 2022-11-23T02:20:13.7467014Z dist init r=1, world=2 2022-11-23T02:20:13.7467257Z ok (3.811s) 2022-11-23T02:20:13.7467677Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7468413Z Tests FSDP's default communication hook's behavior and correctness. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60033 2022-11-23T02:20:13.7468958Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60034 2022-11-23T02:20:13.7469640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7470079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7470662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7471144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7471729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7472159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7472739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:20:13.7473208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7473673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7474155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7474819Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7475866Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7476388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7476866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7477220Z dist init r=0, world=2 2022-11-23T02:20:13.7477476Z dist init r=1, world=2 2022-11-23T02:20:13.7477698Z ok (3.813s) 2022-11-23T02:20:13.7478116Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7478876Z Tests FSDP's default communication hook's behavior and correctness. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60116 2022-11-23T02:20:13.7479408Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60117 2022-11-23T02:20:13.7480028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7480487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7481070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7481528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7482115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7482572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7483135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7483606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7484069Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7484568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7485219Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7485915Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7486443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7487006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7487347Z dist init r=0, world=2 2022-11-23T02:20:13.7487603Z dist init r=1, world=2 2022-11-23T02:20:13.7487846Z ok (3.713s) 2022-11-23T02:20:13.7488249Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7489012Z Tests FSDP's default communication hook's behavior and correctness. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60199 2022-11-23T02:20:13.7489559Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60200 2022-11-23T02:20:13.7490176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7490614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7491193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7491674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7492239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7492686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7493322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7493803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7494241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7494741Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7495409Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7496112Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7496622Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7497099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7497468Z dist init r=1, world=2 2022-11-23T02:20:13.7497707Z dist init r=0, world=2 2022-11-23T02:20:13.7497962Z ok (3.712s) 2022-11-23T02:20:13.7498440Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7499178Z Tests FSDP's communication hook interface behavior. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60282 2022-11-23T02:20:13.7499701Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60283 2022-11-23T02:20:13.7500318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7500771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7501337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7501814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7502401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7502852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7503412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7503881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7504341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7504898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7505561Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7506261Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7506791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7507250Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7507848Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7508238Z return func(*args, **kwargs) 2022-11-23T02:20:13.7508778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7509162Z _check_comm_hook( 2022-11-23T02:20:13.7509680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7510156Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7510807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7511204Z traceback.print_stack() 2022-11-23T02:20:13.7511711Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7512099Z return func(*args, **kwargs) 2022-11-23T02:20:13.7512614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7513001Z _check_comm_hook( 2022-11-23T02:20:13.7513513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7513981Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7514546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7514933Z traceback.print_stack() 2022-11-23T02:20:13.7515807Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:20:13.7516193Z return func(*args, **kwargs) 2022-11-23T02:20:13.7516726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7517117Z _check_comm_hook( 2022-11-23T02:20:13.7517612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7517996Z p_assert( 2022-11-23T02:20:13.7518476Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7518840Z traceback.print_stack() 2022-11-23T02:20:13.7519341Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7519719Z return func(*args, **kwargs) 2022-11-23T02:20:13.7520255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7520630Z _check_comm_hook( 2022-11-23T02:20:13.7521144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7521520Z p_assert( 2022-11-23T02:20:13.7521978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7522359Z traceback.print_stack() 2022-11-23T02:20:13.7522629Z dist init r=1, world=2 2022-11-23T02:20:13.7523018Z Communication hook should not be `None` 2022-11-23T02:20:13.7523349Z Communication hook state should not be `None` 2022-11-23T02:20:13.7523645Z dist init r=0, world=2 2022-11-23T02:20:13.7523912Z Communication hook should not be `None` 2022-11-23T02:20:13.7524239Z Communication hook state should not be `None` 2022-11-23T02:20:13.7524524Z ok (3.813s) 2022-11-23T02:20:13.7524985Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD 
(__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7525726Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60365 2022-11-23T02:20:13.7526252Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60366 2022-11-23T02:20:13.7526869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7527305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7527891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7528365Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7528951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7529453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7530048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7530518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7530979Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7531461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7532132Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7532826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7533338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7533819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7534416Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7534801Z return func(*args, **kwargs) 2022-11-23T02:20:13.7535322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7535714Z _check_comm_hook( 2022-11-23T02:20:13.7536228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7536690Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7537253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7537637Z traceback.print_stack() 2022-11-23T02:20:13.7538148Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7538517Z return func(*args, **kwargs) 2022-11-23T02:20:13.7539053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7539438Z _check_comm_hook( 2022-11-23T02:20:13.7539934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7540410Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7541043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7541429Z traceback.print_stack() 2022-11-23T02:20:13.7541914Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:20:13.7542297Z return func(*args, **kwargs) 2022-11-23T02:20:13.7542837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7543214Z _check_comm_hook( 2022-11-23T02:20:13.7543728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7544107Z p_assert( 2022-11-23T02:20:13.7544580Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7544946Z return func(*args, **kwargs) 2022-11-23T02:20:13.7545445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7545835Z traceback.print_stack() 2022-11-23T02:20:13.7546356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7546746Z _check_comm_hook( 2022-11-23T02:20:13.7547318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7547685Z p_assert( 2022-11-23T02:20:13.7548158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7548542Z traceback.print_stack() 2022-11-23T02:20:13.7548813Z dist init r=0, world=2 2022-11-23T02:20:13.7549084Z Communication hook should not be `None` 2022-11-23T02:20:13.7549413Z Communication hook state should not be `None` 2022-11-23T02:20:13.7549710Z dist init r=1, world=2 2022-11-23T02:20:13.7549974Z Communication hook should not be `None` 2022-11-23T02:20:13.7550306Z Communication hook state should not be `None` 2022-11-23T02:20:13.7550588Z ok (3.711s) 2022-11-23T02:20:13.7551034Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP 
(__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7551796Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60448 2022-11-23T02:20:13.7552330Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60449 2022-11-23T02:20:13.7552944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7553380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7553966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7554447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7555243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7555712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7556297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7572291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7572773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7573265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7573972Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7574677Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7575373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7575836Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7576446Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7576840Z return func(*args, **kwargs) 2022-11-23T02:20:13.7577366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7577762Z _check_comm_hook( 2022-11-23T02:20:13.7578278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7578756Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7579310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7579694Z traceback.print_stack() 2022-11-23T02:20:13.7580200Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7580569Z return func(*args, **kwargs) 2022-11-23T02:20:13.7581185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7581589Z _check_comm_hook( 2022-11-23T02:20:13.7582104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7582562Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7583125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7583505Z traceback.print_stack() 2022-11-23T02:20:13.7583997Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:20:13.7584378Z return func(*args, **kwargs) 2022-11-23T02:20:13.7584910Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7585299Z _check_comm_hook( 2022-11-23T02:20:13.7585799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7586175Z p_assert( 2022-11-23T02:20:13.7586650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7587015Z traceback.print_stack() 2022-11-23T02:20:13.7587514Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7587896Z return func(*args, **kwargs) 2022-11-23T02:20:13.7588408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7588800Z _check_comm_hook( 2022-11-23T02:20:13.7589305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7589680Z p_assert( 2022-11-23T02:20:13.7590138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7590518Z traceback.print_stack() 2022-11-23T02:20:13.7590786Z dist init r=1, world=2 2022-11-23T02:20:13.7591052Z Communication hook should not be `None` 2022-11-23T02:20:13.7591378Z Communication hook state should not be `None` 2022-11-23T02:20:13.7591665Z dist init r=0, world=2 2022-11-23T02:20:13.7591928Z Communication hook should not be `None` 2022-11-23T02:20:13.7592251Z Communication hook state should not be `None` 2022-11-23T02:20:13.7592529Z ok (3.811s) 2022-11-23T02:20:13.7592963Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD 
(__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7593800Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60531 2022-11-23T02:20:13.7594326Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60532 2022-11-23T02:20:13.7594942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7595718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7596310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7596783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7597366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7597802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7598373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7598842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7599370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7599893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7600557Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7601250Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7601758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7602240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7602833Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7603222Z return func(*args, **kwargs) 2022-11-23T02:20:13.7603707Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7604082Z return func(*args, **kwargs) 2022-11-23T02:20:13.7604616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7604990Z _check_comm_hook( 2022-11-23T02:20:13.7605501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7605972Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7606575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7606947Z _check_comm_hook( 2022-11-23T02:20:13.7607428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7607808Z traceback.print_stack() 2022-11-23T02:20:13.7608320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7608794Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7609353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7609731Z traceback.print_stack() 2022-11-23T02:20:13.7610213Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:20:13.7610593Z return func(*args, **kwargs) 2022-11-23T02:20:13.7611263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7611637Z _check_comm_hook( 2022-11-23T02:20:13.7612120Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7612498Z return func(*args, **kwargs) 2022-11-23T02:20:13.7613007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7613383Z p_assert( 2022-11-23T02:20:13.7613855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7614233Z traceback.print_stack() 2022-11-23T02:20:13.7614749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7615133Z _check_comm_hook( 2022-11-23T02:20:13.7615641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7616004Z p_assert( 2022-11-23T02:20:13.7616473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7616852Z traceback.print_stack() 2022-11-23T02:20:13.7617105Z dist init r=1, world=2 2022-11-23T02:20:13.7617451Z Communication hook should not be `None` 2022-11-23T02:20:13.7617787Z Communication hook state should not be `None` 2022-11-23T02:20:13.7618077Z dist init r=0, world=2 2022-11-23T02:20:13.7618342Z Communication hook should not be `None` 2022-11-23T02:20:13.7618668Z Communication hook state should not be `None` 2022-11-23T02:20:13.7618948Z ok (3.811s) 2022-11-23T02:20:13.7619382Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD 
(__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7620131Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60614 2022-11-23T02:20:13.7620659Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60615 2022-11-23T02:20:13.7621256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7621710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7622298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7622770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7623339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7623789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7624367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7624841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7625278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7625782Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7626442Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7627122Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7627651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7628132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7628727Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7629169Z return func(*args, **kwargs) 2022-11-23T02:20:13.7629708Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7630096Z _check_comm_hook( 2022-11-23T02:20:13.7630591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7631067Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7631631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7632015Z traceback.print_stack() 2022-11-23T02:20:13.7632499Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7632881Z return func(*args, **kwargs) 2022-11-23T02:20:13.7633415Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7633794Z _check_comm_hook( 2022-11-23T02:20:13.7634304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7634782Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7635691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7636074Z traceback.print_stack() 2022-11-23T02:20:13.7636576Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:20:13.7636956Z return func(*args, **kwargs) 2022-11-23T02:20:13.7637468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7637857Z _check_comm_hook( 2022-11-23T02:20:13.7638378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7638756Z p_assert( 2022-11-23T02:20:13.7639210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7639595Z traceback.print_stack() 2022-11-23T02:20:13.7640096Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7640461Z return func(*args, **kwargs) 2022-11-23T02:20:13.7640987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7641374Z _check_comm_hook( 2022-11-23T02:20:13.7641904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7642279Z p_assert( 2022-11-23T02:20:13.7642750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7643131Z traceback.print_stack() 2022-11-23T02:20:13.7643383Z dist init r=1, world=2 2022-11-23T02:20:13.7643670Z Communication hook should not be `None` 2022-11-23T02:20:13.7643997Z Communication hook state should not be `None` 2022-11-23T02:20:13.7644273Z dist init r=0, world=2 2022-11-23T02:20:13.7644556Z Communication hook should not be `None` 2022-11-23T02:20:13.7644880Z Communication hook state should not be `None` 2022-11-23T02:20:13.7645144Z ok (3.711s) 2022-11-23T02:20:13.7645598Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP 
(__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7646355Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60697 2022-11-23T02:20:13.7646879Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60698 2022-11-23T02:20:13.7647586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7648043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7648623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7649082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7649665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7650109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7650684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7651135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7651590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7652096Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7652757Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7653500Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7654036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7654511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7655092Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7655481Z return func(*args, **kwargs) 2022-11-23T02:20:13.7655976Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7656354Z return func(*args, **kwargs) 2022-11-23T02:20:13.7656873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7657260Z _check_comm_hook( 2022-11-23T02:20:13.7657778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7658139Z _check_comm_hook( 2022-11-23T02:20:13.7658647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7659119Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7659702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:20:13.7660151Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:20:13.7660717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7661095Z traceback.print_stack() 2022-11-23T02:20:13.7661572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7661951Z traceback.print_stack() 2022-11-23T02:20:13.7662449Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:20:13.7662831Z return func(*args, **kwargs) 2022-11-23T02:20:13.7663344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7663730Z _check_comm_hook( 2022-11-23T02:20:13.7664243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7664598Z p_assert( 2022-11-23T02:20:13.7665142Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7665521Z traceback.print_stack() 2022-11-23T02:20:13.7666003Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:20:13.7666385Z return func(*args, **kwargs) 2022-11-23T02:20:13.7666920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:20:13.7667307Z _check_comm_hook( 2022-11-23T02:20:13.7667829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:20:13.7668206Z p_assert( 2022-11-23T02:20:13.7668676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:20:13.7669041Z traceback.print_stack() 2022-11-23T02:20:13.7669309Z dist init r=0, world=2 2022-11-23T02:20:13.7669598Z Communication hook should not be `None` 2022-11-23T02:20:13.7669908Z Communication hook state should not be `None` 2022-11-23T02:20:13.7670222Z dist init r=1, world=2 2022-11-23T02:20:13.7670506Z Communication hook should not be `None` 2022-11-23T02:20:13.7670815Z Communication hook state should not be `None` 2022-11-23T02:20:13.7671093Z ok (3.811s) 2022-11-23T02:20:13.7671666Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60780 2022-11-23T02:20:13.7672284Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60781 2022-11-23T02:20:13.7672883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7673337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7673919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7674398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7674962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7675701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7676291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7676741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7677195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7677697Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7678352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7679037Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7679562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7680036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7680398Z dist init r=0, world=2 2022-11-23T02:20:13.7680633Z dist init r=1, world=2 2022-11-23T02:20:13.7680877Z ok (3.710s) 2022-11-23T02:20:13.7681388Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60863 2022-11-23T02:20:13.7681973Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60864 2022-11-23T02:20:13.7682590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7683182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7683770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7684224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7684810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7685259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7685815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7686284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7686733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7687236Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7687883Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7688578Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7689178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7689664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7690005Z dist init r=0, world=2 2022-11-23T02:20:13.7690259Z dist init r=1, world=2 2022-11-23T02:20:13.7690499Z ok (3.811s) 2022-11-23T02:20:13.7691000Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60946 2022-11-23T02:20:13.7691615Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60947 2022-11-23T02:20:13.7692235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7692674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7693261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7693731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7694312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7694741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7695312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7695775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:20:13.7696234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7696721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7697386Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7698080Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7698586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7699058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7699413Z dist init r=1, world=2 2022-11-23T02:20:13.7699667Z dist init r=0, world=2 2022-11-23T02:20:13.7699890Z ok (3.811s) 2022-11-23T02:20:13.7700472Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61029 2022-11-23T02:20:13.7701078Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61030 2022-11-23T02:20:13.7701683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7702136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7702713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7703186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7703752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7704200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7704781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7705245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7705681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7706242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7706918Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7707595Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7708124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7708646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7709013Z dist init r=0, world=2 2022-11-23T02:20:13.7709251Z dist init r=1, world=2 2022-11-23T02:20:13.7709491Z ok (3.810s) 2022-11-23T02:20:13.7710002Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61112 2022-11-23T02:20:13.7710621Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61113 2022-11-23T02:20:13.7711278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7711729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7712310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7712763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7713344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7713790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7714349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7714821Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7715514Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7716016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7716661Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7717356Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7717975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7718448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7718793Z dist init r=1, world=2 2022-11-23T02:20:13.7719049Z dist init r=0, world=2 2022-11-23T02:20:13.7719290Z ok (3.810s) 2022-11-23T02:20:13.7719795Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61195 2022-11-23T02:20:13.7720430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61196 2022-11-23T02:20:13.7721054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7721506Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7722071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7722541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7723120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7723619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7724206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7724671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:20:13.7725120Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7725606Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7726265Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7726992Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7727521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7727983Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7728336Z dist init r=1, world=2 2022-11-23T02:20:13.7728587Z dist init r=0, world=2 2022-11-23T02:20:13.7728836Z ok (3.810s) 2022-11-23T02:20:13.7729233Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7729950Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61278 2022-11-23T02:20:13.7730483Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61279 2022-11-23T02:20:13.7731085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7731574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7732151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7732609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7733194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7733644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7734216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7734666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7735215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7735718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7736361Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7737218Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7737745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7738216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7738556Z dist init r=0, world=2 2022-11-23T02:20:13.7738808Z dist init r=1, world=2 2022-11-23T02:20:13.7739047Z ok (3.310s) 2022-11-23T02:20:13.7739428Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7740149Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61357 2022-11-23T02:20:13.7740683Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61358 2022-11-23T02:20:13.7741357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7741805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7742384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7742862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7743446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7743872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7744453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7744918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7745357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7745864Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7746522Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7747212Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7747721Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7748194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7748553Z dist init r=1, world=2 2022-11-23T02:20:13.7748791Z dist init r=0, world=2 2022-11-23T02:20:13.7749031Z ok (3.310s) 2022-11-23T02:20:13.7749435Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7750159Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61436 2022-11-23T02:20:13.7750681Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61437 2022-11-23T02:20:13.7751286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7751740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7752304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7752848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7753432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7753885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7754448Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7754914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7755685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7756186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7756829Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7757529Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7758048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7758509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7758865Z dist init r=0, world=2 2022-11-23T02:20:13.7759199Z dist init r=1, world=2 2022-11-23T02:20:13.7759434Z ok (3.310s) 2022-11-23T02:20:13.7759840Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7760563Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61515 2022-11-23T02:20:13.7761094Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61516 2022-11-23T02:20:13.7761688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7762146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7762726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7763197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7763764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7764213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7764789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7765237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7765688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7766200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7766860Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7767541Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:20:13.7768066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7768538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7768896Z dist init r=1, world=2 2022-11-23T02:20:13.7769132Z dist init r=0, world=2 2022-11-23T02:20:13.7769370Z ok (3.210s) 2022-11-23T02:20:13.7769770Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7770472Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61594 2022-11-23T02:20:13.7771109Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61595 2022-11-23T02:20:13.7771723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7772182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7772745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7773218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7773798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7774229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7774803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7775275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7775732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7776272Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7776943Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7777629Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7778152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7778609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7778959Z dist init r=0, world=2 2022-11-23T02:20:13.7779220Z dist init r=1, world=2 2022-11-23T02:20:13.7779443Z ok (3.311s) 2022-11-23T02:20:13.7779850Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T02:20:13.7780574Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61673 2022-11-23T02:20:13.7781093Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61674 2022-11-23T02:20:13.7781706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7782157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7782734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7783187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7783773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:13.7784219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:13.7784789Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:13.7785238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:13.7785690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:20:13.7786188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:13.7786830Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7787522Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:20:13.7788123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:20:13.7788592Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:13.7788934Z dist init r=1, world=2 2022-11-23T02:20:13.7789188Z dist init r=0, world=2 2022-11-23T02:20:13.7789428Z ok (3.310s) 2022-11-23T02:20:13.7789583Z 2022-11-23T02:20:13.7789843Z ---------------------------------------------------------------------- 2022-11-23T02:20:13.7790178Z Ran 27 tests in 100.773s 2022-11-23T02:20:13.7790342Z 2022-11-23T02:20:13.7790437Z OK 2022-11-23T02:20:13.7790574Z 2022-11-23T02:20:13.7790681Z Generating XML reports... 
2022-11-23T02:20:13.7791308Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20221123021832.xml 2022-11-23T02:20:13.7791681Z 2022-11-23T02:20:13.7792159Z ##[endgroup] 2022-11-23T02:20:13.7792781Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hooks_3qugb7bv) 2022-11-23T02:20:13.7793150Z 2022-11-23T02:20:13.7793394Z Running distributed/test_c10d_pypg ... [2022-11-23 02:20:13.740961] 2022-11-23T02:20:13.7794127Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_pypg.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:20:13.741234] 2022-11-23T02:22:19.8250998Z 2022-11-23T02:22:19.8251489Z Expand the folded group to see the log file of distributed/test_c10d_pypg 2022-11-23T02:22:19.8255346Z ##[group]PRINTING LOG FILE of distributed/test_c10d_pypg (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_pypg_7o8u25rr) 2022-11-23T02:22:19.8255812Z 2022-11-23T02:22:19.8257792Z Running tests... 2022-11-23T02:22:19.8258788Z ---------------------------------------------------------------------- 2022-11-23T02:22:19.8259416Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_pypg 2022-11-23T02:22:19.8259914Z test_ddp_checkpointing_dynamic_module (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8260528Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:22:19.8261029Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61787 2022-11-23T02:22:19.8263228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8264011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8264641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8265126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8265556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8266074Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0bwzuohe 2022-11-23T02:22:19.8266627Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0bwzuohe/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8267018Z ok (5.177s) 2022-11-23T02:22:19.8267376Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8267954Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61824 2022-11-23T02:22:19.8268674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8269128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8269690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8270166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8270875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8271363Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp56953x13 2022-11-23T02:22:19.8271919Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp56953x13/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8272313Z ok (3.508s) 2022-11-23T02:22:19.8272678Z test_ddp_checkpointing_once_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8273224Z DDP works as expected when layer is checkpointed only once. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61861 2022-11-23T02:22:19.8273933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8274422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8275008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8275933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8276383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8277016Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptdt0ligt 2022-11-23T02:22:19.8277582Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptdt0ligt/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8278118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8278611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8279809Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:22:19.8280593Z warnings.warn( 2022-11-23T02:22:19.8280971Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8281446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8281926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:22:19.8282404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8282755Z ok (3.609s) 2022-11-23T02:22:19.8283094Z test_ddp_checkpointing_once_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8283646Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61898 2022-11-23T02:22:19.8284353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8284792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8285376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8285850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8286288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8286778Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe7t52cwm 2022-11-23T02:22:19.8287321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe7t52cwm/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8287834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8288304Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8289576Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:22:19.8290303Z warnings.warn( 2022-11-23T02:22:19.8290677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8291145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8291627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8292108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8292456Z ok (3.709s) 2022-11-23T02:22:19.8292821Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8293541Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61935 2022-11-23T02:22:19.8294291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8294752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8295317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8295789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8296228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8296724Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk9yaiipo 2022-11-23T02:22:19.8297274Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk9yaiipo/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8297791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8298274Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:22:19.8298613Z ok (3.510s) 2022-11-23T02:22:19.8298992Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8299703Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61972 2022-11-23T02:22:19.8300382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8300834Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8301410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8301889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8302495Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8303005Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpixfvno42 2022-11-23T02:22:19.8303545Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpixfvno42/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8304061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8304531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8304885Z ok (3.408s) 2022-11-23T02:22:19.8305247Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8305956Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62009 2022-11-23T02:22:19.8306737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8307186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8307772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8308228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8308664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8309165Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdybcfdla 2022-11-23T02:22:19.8309687Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdybcfdla/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8310211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8311293Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:22:19.8312289Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8312640Z ok (3.609s) 2022-11-23T02:22:19.8313005Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8313706Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62046 2022-11-23T02:22:19.8314421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8314877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8315914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8316375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8316819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8317322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwqv3ltek 2022-11-23T02:22:19.8317847Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwqv3ltek/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8318229Z ok (3.509s) 2022-11-23T02:22:19.8318584Z test_ddp_checkpointing_twice_weight_sharing (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8319156Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62083 2022-11-23T02:22:19.8319851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8320310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8320888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8321338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8321779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8322283Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1nmxtdr8 2022-11-23T02:22:19.8322820Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1nmxtdr8/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8323421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8323907Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8324257Z ok (3.509s) 2022-11-23T02:22:19.8324619Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8325201Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62120 2022-11-23T02:22:19.8325910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8326363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8326925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8327401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8327838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8328339Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn9z6nfrj 2022-11-23T02:22:19.8328936Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn9z6nfrj/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8330003Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:22:19.8331681Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:22:19.8332404Z warnings.warn( 2022-11-23T02:22:19.8332783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8333253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8333604Z ok (3.509s) 2022-11-23T02:22:19.8333974Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8334558Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62157 2022-11-23T02:22:19.8335260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8335717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8336305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8336764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8337206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8337706Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4r59dw8d 2022-11-23T02:22:19.8338243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4r59dw8d/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8339422Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:22:19.8340220Z warnings.warn( 2022-11-23T02:22:19.8340606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8341094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8341429Z ok (3.609s) 2022-11-23T02:22:19.8341803Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8342364Z Test that checkpointing with weight sharing works. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62194 2022-11-23T02:22:19.8343052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8343490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8344068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8344536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8345007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8345522Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_nks66_4 2022-11-23T02:22:19.8346055Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_nks66_4/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8346566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8347034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8347391Z ok (3.609s) 2022-11-23T02:22:19.8347767Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-11-23T02:22:19.8348303Z Test that checkpointing with weight sharing works. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62231 2022-11-23T02:22:19.8348989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8349441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8350021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8350482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8350921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8351423Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpue1ezoiy 2022-11-23T02:22:19.8351953Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpue1ezoiy/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8352471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8352959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8353446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8353911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8354260Z ok (3.609s) 2022-11-23T02:22:19.8354707Z test_ddp_invoke_work_object (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62268 2022-11-23T02:22:19.8355756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8356193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8356863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8357335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8357757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8358263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4z8ssen_ 2022-11-23T02:22:19.8358802Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4z8ssen_/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8359180Z ok (2.207s) 2022-11-23T02:22:19.8359594Z test_ddp_with_pypg (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62304 2022-11-23T02:22:19.8360285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8360742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8361302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8361770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8362274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8362789Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp61x_bay5 2022-11-23T02:22:19.8363309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp61x_bay5/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8363820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8364171Z ok (2.230s) 2022-11-23T02:22:19.8364608Z test_ddp_with_pypg_with_grad_views (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62340 2022-11-23T02:22:19.8365328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8365780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8366356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8366816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8367255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8367757Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe4f2j296 2022-11-23T02:22:19.8368293Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe4f2j296/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8368791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8369151Z ok (2.206s) 2022-11-23T02:22:19.8369597Z test_invalid_powerSGD_state (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62376 2022-11-23T02:22:19.8370290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8370743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8371318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8371783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8372203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8372995Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8374158Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8375228Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8376285Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; 
orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8377410Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8378475Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8379080Z ok (2.206s) 2022-11-23T02:22:19.8379538Z test_sync_batch_norm_empty_input (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62410 2022-11-23T02:22:19.8380253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8380704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8381269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8381741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8382185Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8382673Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq5tgodb8 2022-11-23T02:22:19.8383215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq5tgodb8/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8383739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8384094Z ok (3.008s) 2022-11-23T02:22:19.8384535Z test_sync_batch_norm_only_empty_input (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62447 2022-11-23T02:22:19.8385257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8385706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8386288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8386745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8387185Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8387687Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv5lnljp8 2022-11-23T02:22:19.8388280Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv5lnljp8/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8388800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8389155Z ok (3.008s) 2022-11-23T02:22:19.8389504Z test_ddp_checkpointing_dynamic_module (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8390189Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62484 2022-11-23T02:22:19.8390898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8391351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8391911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8392389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8392827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8393334Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr2txwap_ 2022-11-23T02:22:19.8393914Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr2txwap_/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8394316Z ok (3.509s) 2022-11-23T02:22:19.8394672Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8395589Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62521 2022-11-23T02:22:19.8396312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8396763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8397349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8397803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8398245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8398748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8rkj0zr8 2022-11-23T02:22:19.8399288Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8rkj0zr8/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8399652Z ok (3.609s) 2022-11-23T02:22:19.8400012Z test_ddp_checkpointing_once_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8400567Z DDP works as expected when layer is checkpointed only once. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62558 2022-11-23T02:22:19.8401245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8401704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8402282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8402752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8403173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8403677Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsfjg_i06 2022-11-23T02:22:19.8404212Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsfjg_i06/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8404707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8405193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8406461Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:22:19.8407179Z warnings.warn( 2022-11-23T02:22:19.8407552Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8408015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8408497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:22:19.8408973Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8409303Z ok (3.609s) 2022-11-23T02:22:19.8409668Z test_ddp_checkpointing_once_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8410219Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62595 2022-11-23T02:22:19.8410911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8411413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8412007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8412476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8412902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8413403Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkmzaajm3 2022-11-23T02:22:19.8413951Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkmzaajm3/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8414467Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8414936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8416149Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:22:19.8416866Z warnings.warn( 2022-11-23T02:22:19.8417243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8417712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8418196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8418674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8419019Z ok (3.610s) 2022-11-23T02:22:19.8419380Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8420096Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62632 2022-11-23T02:22:19.8420791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8421226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8421807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8422277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8422776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8423261Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaq8juyer 2022-11-23T02:22:19.8423801Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaq8juyer/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8424323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8424791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:22:19.8425146Z ok (3.510s) 2022-11-23T02:22:19.8425524Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8426240Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62669 2022-11-23T02:22:19.8426927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8427381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8427958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8428481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8428917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8429419Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp030i16ew 2022-11-23T02:22:19.8429956Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp030i16ew/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8430452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8430936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8431295Z ok (3.509s) 2022-11-23T02:22:19.8431640Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8432361Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62706 2022-11-23T02:22:19.8433074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8433531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8434092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8434563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8435001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8435683Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi3fbvvbe 2022-11-23T02:22:19.8436211Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi3fbvvbe/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8436729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8437760Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:22:19.8438733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8439183Z ok (3.609s) 2022-11-23T02:22:19.8439542Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8440263Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62743 2022-11-23T02:22:19.8440979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8441415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8441998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8442472Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8442912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8443395Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbdwnabz9 2022-11-23T02:22:19.8443941Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbdwnabz9/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8444326Z ok (3.509s) 2022-11-23T02:22:19.8444663Z test_ddp_checkpointing_twice_weight_sharing (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8445300Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62780 2022-11-23T02:22:19.8446024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8446473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8447037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8447504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8447949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8448435Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv2nil23e 2022-11-23T02:22:19.8448976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv2nil23e/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8449496Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8449985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8450321Z ok (3.508s) 2022-11-23T02:22:19.8450693Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8451272Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62817 2022-11-23T02:22:19.8451989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8452431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8453012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8453482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8453907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8454407Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp819qfmvq 2022-11-23T02:22:19.8454948Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp819qfmvq/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8456005Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:22:19.8457742Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:22:19.8458463Z warnings.warn( 2022-11-23T02:22:19.8458824Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8459309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8459663Z ok (3.609s) 2022-11-23T02:22:19.8460014Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8460589Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62854 2022-11-23T02:22:19.8461347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8461810Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8462375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8462843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8463281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8463766Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptdn502ue 2022-11-23T02:22:19.8464309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptdn502ue/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8465511Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:22:19.8466231Z warnings.warn( 2022-11-23T02:22:19.8466590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8467075Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8467429Z ok (3.609s) 2022-11-23T02:22:19.8467798Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8468339Z Test that checkpointing with weight sharing works. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62891 2022-11-23T02:22:19.8469018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8469470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8470054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8470512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8470949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8471451Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp82itt6qn 2022-11-23T02:22:19.8471973Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp82itt6qn/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8472555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8473039Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8473392Z ok (3.509s) 2022-11-23T02:22:19.8473745Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-11-23T02:22:19.8474305Z Test that checkpointing with weight sharing works. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62928 2022-11-23T02:22:19.8474989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8475652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8476236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8476705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8477151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8477636Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0sbvdaw6 2022-11-23T02:22:19.8478172Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0sbvdaw6/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8478763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8479248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8479728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8480205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8480555Z ok (3.609s) 2022-11-23T02:22:19.8480983Z test_ddp_invoke_work_object (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62965 2022-11-23T02:22:19.8481695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8482146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8482727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8483183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8483628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8484127Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1cw1ix95 2022-11-23T02:22:19.8484648Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1cw1ix95/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8485029Z ok (2.206s) 2022-11-23T02:22:19.8485466Z test_ddp_with_pypg (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63001 2022-11-23T02:22:19.8486169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8486605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8487189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8487666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8488091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8488595Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3p8hgdkv 2022-11-23T02:22:19.8489134Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3p8hgdkv/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8489648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8490084Z ok (2.207s) 2022-11-23T02:22:19.8490536Z test_ddp_with_pypg_with_grad_views (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63037 2022-11-23T02:22:19.8491242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8491700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8492268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8492736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8493175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8493658Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmwxqzb_j 2022-11-23T02:22:19.8494203Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmwxqzb_j/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8494721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8495072Z ok (2.206s) 2022-11-23T02:22:19.8495558Z test_invalid_powerSGD_state (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63073 2022-11-23T02:22:19.8496274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8496723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8497283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8497752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8498191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8498989Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8500070Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8501120Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8502189Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; 
orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8503249Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8504314Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:22:19.8505010Z ok (2.206s) 2022-11-23T02:22:19.8505463Z test_sync_batch_norm_empty_input (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63107 2022-11-23T02:22:19.8506161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8506615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8507192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8507661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8508086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8508593Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwk4ta0mx 2022-11-23T02:22:19.8509136Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwk4ta0mx/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8509634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8509987Z ok (3.008s) 2022-11-23T02:22:19.8510493Z test_sync_batch_norm_only_empty_input (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63144 2022-11-23T02:22:19.8511216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:22:19.8511653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:22:19.8512232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:22:19.8512709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:22:19.8513148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:22:19.8513636Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn37q58hh 2022-11-23T02:22:19.8514183Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn37q58hh/_remote_module_non_scriptable.py 2022-11-23T02:22:19.8514698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:22:19.8515196Z ok (2.908s) 2022-11-23T02:22:19.8515356Z 2022-11-23T02:22:19.8515632Z ---------------------------------------------------------------------- 2022-11-23T02:22:19.8515997Z Ran 38 tests in 123.811s 2022-11-23T02:22:19.8516163Z 2022-11-23T02:22:19.8516257Z OK 2022-11-23T02:22:19.8516375Z 2022-11-23T02:22:19.8516499Z Generating XML reports... 
2022-11-23T02:22:19.8517106Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkSubclass-20221123022015.xml 2022-11-23T02:22:19.8517895Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkWrapper-20221123022015.xml 2022-11-23T02:22:19.8518248Z 2022-11-23T02:22:19.8518629Z ##[endgroup] 2022-11-23T02:22:19.8519201Z FINISHED PRINTING LOG FILE of distributed/test_c10d_pypg (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_pypg_7o8u25rr) 2022-11-23T02:22:19.8519534Z 2022-11-23T02:22:19.8519832Z Running distributed/fsdp/test_fsdp_summon_full_params ... [2022-11-23 02:22:19.825785] 2022-11-23T02:22:19.8520556Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_summon_full_params.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:22:19.826051] 2022-11-23T02:25:23.3061020Z 2022-11-23T02:25:23.3063547Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_summon_full_params 2022-11-23T02:25:23.3064521Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_summon_full_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_summon_full_params_46s8rm4e) 2022-11-23T02:25:23.3065182Z 2022-11-23T02:25:23.3065297Z Running tests... 2022-11-23T02:25:23.3067944Z ---------------------------------------------------------------------- 2022-11-23T02:25:23.3068898Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params 2022-11-23T02:25:23.3069496Z test_cannot_summon_full_params_from_backward (__main__.TestSummonFullParams) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:23.3070255Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63216 2022-11-23T02:25:23.3070731Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63217 2022-11-23T02:25:23.3071590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3072285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3072970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3073617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3074546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3075323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3075935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3076400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3076860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3077562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3078304Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3079301Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3080060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3080546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3082166Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3083056Z warnings.warn( 2022-11-23T02:25:23.3084199Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3084985Z warnings.warn( 2022-11-23T02:25:23.3085243Z dist init r=1, world=2 2022-11-23T02:25:23.3085495Z dist init r=0, world=2 2022-11-23T02:25:23.3085723Z ok (5.474s) 2022-11-23T02:25:23.3086183Z test_cannot_summon_full_params_from_forward (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63299 2022-11-23T02:25:23.3086858Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63300 2022-11-23T02:25:23.3087634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3088093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3088683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3089152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3089752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3090208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3090789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3091252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3091715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3092221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3092863Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3093648Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3094198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3094681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3095951Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3096728Z warnings.warn( 2022-11-23T02:25:23.3097878Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3098660Z warnings.warn( 2022-11-23T02:25:23.3098909Z dist init r=0, world=2 2022-11-23T02:25:23.3099166Z dist init r=1, world=2 2022-11-23T02:25:23.3099392Z ok (3.309s) 2022-11-23T02:25:23.3099754Z test_named_parameters_buffers_prefix__recurse_False (__main__.TestSummonFullParams) 2022-11-23T02:25:23.3100313Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63378 2022-11-23T02:25:23.3100823Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63379 2022-11-23T02:25:23.3101451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3101905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3102487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3102944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3103527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3104048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3104609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3105080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3105537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3106039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3106684Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3107376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3107900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3108380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3108721Z dist init r=0, world=2 2022-11-23T02:25:23.3108976Z dist init r=1, world=2 2022-11-23T02:25:23.3109215Z ok (3.310s) 2022-11-23T02:25:23.3109555Z test_named_parameters_buffers_prefix__recurse_True (__main__.TestSummonFullParams) 2022-11-23T02:25:23.3110158Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63457 2022-11-23T02:25:23.3110698Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63458 2022-11-23T02:25:23.3111312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3111748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3112330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3112806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3113369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3113815Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3114392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3114860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3115632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3116128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-11-23T02:25:23.3116788Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3117485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3117993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3118469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3118830Z dist init r=1, world=2 2022-11-23T02:25:23.3119069Z dist init r=0, world=2 2022-11-23T02:25:23.3119311Z ok (3.310s) 2022-11-23T02:25:23.3119685Z test_named_parameters_buffers_prefix_test_prefix_recurse_False (__main__.TestSummonFullParams) 2022-11-23T02:25:23.3120228Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63536 2022-11-23T02:25:23.3120754Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63537 2022-11-23T02:25:23.3121370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3121924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3122526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3122998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3123585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3124032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3124593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:25:23.3125063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3125517Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3126001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3126658Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3127354Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3127955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3128426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3128779Z dist init r=0, world=2 2022-11-23T02:25:23.3129036Z dist init r=1, world=2 2022-11-23T02:25:23.3129261Z ok (3.310s) 2022-11-23T02:25:23.3129634Z test_named_parameters_buffers_prefix_test_prefix_recurse_True (__main__.TestSummonFullParams) 2022-11-23T02:25:23.3130193Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63615 2022-11-23T02:25:23.3130726Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63616 2022-11-23T02:25:23.3131330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3131784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3132368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3132827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3133415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3133861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3134437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3134894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3135352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3135852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3136517Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3137195Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3137720Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3138193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3138535Z dist init r=1, world=2 2022-11-23T02:25:23.3138860Z dist init r=0, world=2 2022-11-23T02:25:23.3139105Z ok (3.310s) 2022-11-23T02:25:23.3139614Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63694 2022-11-23T02:25:23.3140224Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63695 2022-11-23T02:25:23.3140843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3141301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3141863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3142340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3142926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3143386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3143945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3144416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3144929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3145426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3146089Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3146781Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3147303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3147765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3148117Z dist init r=1, world=2 2022-11-23T02:25:23.3148371Z dist init r=0, world=2 2022-11-23T02:25:23.3148594Z ok (3.310s) 2022-11-23T02:25:23.3149115Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63773 2022-11-23T02:25:23.3149722Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63774 2022-11-23T02:25:23.3150343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3150777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3151354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3151827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3152405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3152833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3153410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3153877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:25:23.3154314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3154811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3155840Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3156635Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3157143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3157615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3157974Z dist init r=1, world=2 2022-11-23T02:25:23.3158214Z dist init r=0, world=2 2022-11-23T02:25:23.3158459Z ok (3.210s) 2022-11-23T02:25:23.3158978Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63852 2022-11-23T02:25:23.3159586Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63853 2022-11-23T02:25:23.3160187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3160641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3161222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3161693Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3162335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3162795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3163373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3163820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3164278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3164775Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3165444Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3166122Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3182559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3183111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3184259Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3184945Z warnings.warn( 2022-11-23T02:25:23.3185919Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3186573Z warnings.warn( 2022-11-23T02:25:23.3187533Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3188192Z warnings.warn( 2022-11-23T02:25:23.3189148Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T02:25:23.3189942Z warnings.warn( 2022-11-23T02:25:23.3190176Z dist init r=0, world=2 2022-11-23T02:25:23.3190427Z dist init r=1, world=2 2022-11-23T02:25:23.3190666Z ok (3.310s) 2022-11-23T02:25:23.3191188Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63931 2022-11-23T02:25:23.3191776Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63932 2022-11-23T02:25:23.3192394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3192847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3193417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3193894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3194478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3194984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3195951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3196421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3196879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3197384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3198033Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3198737Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3199264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3199729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3200817Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3201486Z warnings.warn( 2022-11-23T02:25:23.3202461Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3203128Z warnings.warn( 2022-11-23T02:25:23.3204073Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T02:25:23.3204720Z warnings.warn( 2022-11-23T02:25:23.3205672Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3206431Z warnings.warn( 2022-11-23T02:25:23.3206680Z dist init r=0, world=2 2022-11-23T02:25:23.3206914Z dist init r=1, world=2 2022-11-23T02:25:23.3207154Z ok (3.310s) 2022-11-23T02:25:23.3207676Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64010 2022-11-23T02:25:23.3208266Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64011 2022-11-23T02:25:23.3208885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3209340Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3209919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3210376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3210958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3211404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3212047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3212528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3212984Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3213487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3214134Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3214831Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3215353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3215830Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3216177Z dist init r=1, world=2 2022-11-23T02:25:23.3216429Z dist init r=0, world=2 2022-11-23T02:25:23.3216669Z ok (3.310s) 2022-11-23T02:25:23.3217170Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64089 2022-11-23T02:25:23.3217777Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64090 2022-11-23T02:25:23.3218393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3218837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3219413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3219883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3220465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3220897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3221467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3221934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3222422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3223031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3223694Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3224386Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3224900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3225377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3225732Z dist init r=0, world=2 2022-11-23T02:25:23.3225986Z dist init r=1, world=2 2022-11-23T02:25:23.3226211Z ok (3.310s) 2022-11-23T02:25:23.3226726Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64168 2022-11-23T02:25:23.3227328Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64169 2022-11-23T02:25:23.3227933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3228387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3229021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3229503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3230072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3230526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3231097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3231569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3232007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3232506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3233168Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3233837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3234357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3234836Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3235445Z dist init r=1, world=2 2022-11-23T02:25:23.3235686Z dist init r=0, world=2 2022-11-23T02:25:23.3235927Z ok (3.310s) 2022-11-23T02:25:23.3236446Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64247 2022-11-23T02:25:23.3237031Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64248 2022-11-23T02:25:23.3237657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3238109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3238982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3239440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3240019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3240468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3241150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3241598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:25:23.3242053Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3242563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3243210Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3243905Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3244429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3244905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3245252Z dist init r=1, world=2 2022-11-23T02:25:23.3245507Z dist init r=0, world=2 2022-11-23T02:25:23.3245749Z ok (3.310s) 2022-11-23T02:25:23.3246316Z test_params_count_and_value_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64326 2022-11-23T02:25:23.3246931Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64327 2022-11-23T02:25:23.3247546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3247997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3248557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3249025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3249615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3250063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3250617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3251086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3251541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3252030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3252693Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3253383Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3253914Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3254370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3254720Z dist init r=0, world=2 2022-11-23T02:25:23.3254974Z dist init r=1, world=2 2022-11-23T02:25:23.3255198Z ok (3.310s) 2022-11-23T02:25:23.3255714Z test_params_count_and_value_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64405 2022-11-23T02:25:23.3256311Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64406 2022-11-23T02:25:23.3256929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3257366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3258023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3258494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3259059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3259509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3260088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3260556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3260994Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3261495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3262156Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3262854Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3263358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3263888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3264255Z dist init r=0, world=2 2022-11-23T02:25:23.3264492Z dist init r=1, world=2 2022-11-23T02:25:23.3264736Z ok (3.310s) 2022-11-23T02:25:23.3265249Z test_params_count_and_value_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64484 2022-11-23T02:25:23.3265850Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64485 2022-11-23T02:25:23.3266458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3266911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3267489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3267948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3268534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3268982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3269555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3270004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:25:23.3270455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3270963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3271625Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3272308Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3272832Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3273303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3273645Z dist init r=0, world=2 2022-11-23T02:25:23.3273897Z dist init r=1, world=2 2022-11-23T02:25:23.3274140Z ok (3.310s) 2022-11-23T02:25:23.3274656Z test_params_count_and_value_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64563 2022-11-23T02:25:23.3275761Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64564 2022-11-23T02:25:23.3276387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3276841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3277408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3277881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3278470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3278918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3279478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3279954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3280412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3280897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3281641Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3282352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3282877Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3283335Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3283692Z dist init r=0, world=2 2022-11-23T02:25:23.3283952Z dist init r=1, world=2 2022-11-23T02:25:23.3284177Z ok (3.310s) 2022-11-23T02:25:23.3284687Z test_params_count_and_value_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64642 2022-11-23T02:25:23.3285291Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64643 2022-11-23T02:25:23.3285908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3286345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3286921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3287396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3287979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3288415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3288988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3289454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3289892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3290392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3291048Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3291740Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3292249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3292810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3293173Z dist init r=0, world=2 2022-11-23T02:25:23.3293428Z dist init r=1, world=2 2022-11-23T02:25:23.3293653Z ok (3.310s) 2022-11-23T02:25:23.3294174Z test_params_count_and_value_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64721 2022-11-23T02:25:23.3294775Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64722 2022-11-23T02:25:23.3295378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3295832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3296412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3296888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3297450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3297897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3298526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3298985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:25:23.3299439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3299942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3300603Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3301289Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3301811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3302283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3302644Z dist init r=1, world=2 2022-11-23T02:25:23.3302885Z dist init r=0, world=2 2022-11-23T02:25:23.3303128Z ok (3.311s) 2022-11-23T02:25:23.3303640Z test_params_count_and_value_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64800 2022-11-23T02:25:23.3304219Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64801 2022-11-23T02:25:23.3304833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3305290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3305871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3306322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3306912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3307365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3307922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3308387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3308843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3309345Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3310072Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3310759Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3311289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3311767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3312108Z dist init r=1, world=2 2022-11-23T02:25:23.3312363Z dist init r=0, world=2 2022-11-23T02:25:23.3312603Z ok (3.311s) 2022-11-23T02:25:23.3313095Z test_params_count_and_value_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64879 2022-11-23T02:25:23.3313692Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64880 2022-11-23T02:25:23.3314316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3314767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3315674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3316161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3316748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3317178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3317751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3318212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3318673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3319157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3319818Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3320514Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3321023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3321496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3321850Z dist init r=1, world=2 2022-11-23T02:25:23.3322103Z dist init r=0, world=2 2022-11-23T02:25:23.3322327Z ok (3.410s) 2022-11-23T02:25:23.3322689Z test_raises_rank0_with_writeback (__main__.TestSummonFullParams) 2022-11-23T02:25:23.3323210Z Tests that ``summon_full_params()`` with both ``rank0_only=True`` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64958 2022-11-23T02:25:23.3323717Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64959 2022-11-23T02:25:23.3324337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3324788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3325368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3325819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3326401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3326849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3327507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3327973Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3328429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3328938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3329584Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3330270Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3330790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3331262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3331605Z dist init r=1, world=2 2022-11-23T02:25:23.3331858Z dist init r=0, world=2 2022-11-23T02:25:23.3332099Z ok (3.309s) 2022-11-23T02:25:23.3332685Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65037 2022-11-23T02:25:23.3333329Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65038 2022-11-23T02:25:23.3333946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3334400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3334964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3335434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3336022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3336452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3337025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3337494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3337953Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3338437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3339098Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3339789Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3340313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3340770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3342037Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3342810Z warnings.warn( 2022-11-23T02:25:23.3343966Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3344810Z warnings.warn( 2022-11-23T02:25:23.3345578Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3346163Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3346975Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 
2022-11-23T02:25:23.3347548Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3347820Z dist init r=0, world=2 2022-11-23T02:25:23.3348080Z dist init r=1, world=2 2022-11-23T02:25:23.3348320Z ok (3.712s) 2022-11-23T02:25:23.3348844Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65120 2022-11-23T02:25:23.3349534Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65121 2022-11-23T02:25:23.3350164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3350617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3351179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3351650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3352238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3352685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3353241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3353711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3354167Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3354652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3355552Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3356253Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3356785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3357243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3358509Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3359290Z warnings.warn( 2022-11-23T02:25:23.3360437Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3361292Z warnings.warn( 2022-11-23T02:25:23.3362059Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3362644Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3363453Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3364028Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3364301Z dist init r=0, world=2 2022-11-23T02:25:23.3364554Z dist init r=1, world=2 2022-11-23T02:25:23.3364798Z ok (3.812s) 2022-11-23T02:25:23.3365341Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65203 2022-11-23T02:25:23.3366021Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65204 2022-11-23T02:25:23.3366646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3367152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3367717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3368185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3368764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3369216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3369776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3370247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3370701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3371204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3371849Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3372543Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3373076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3373535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3374799Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3375576Z warnings.warn( 2022-11-23T02:25:23.3376723Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3377572Z warnings.warn( 2022-11-23T02:25:23.3378353Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 
2022-11-23T02:25:23.3378922Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3379729Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3380306Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3381314Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3381967Z warnings.warn( 2022-11-23T02:25:23.3382986Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3383647Z warnings.warn( 2022-11-23T02:25:23.3384619Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T02:25:23.3385287Z warnings.warn( 2022-11-23T02:25:23.3386236Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3386883Z warnings.warn( 2022-11-23T02:25:23.3387136Z dist init r=1, world=2 2022-11-23T02:25:23.3387388Z dist init r=0, world=2 2022-11-23T02:25:23.3387610Z ok (3.812s) 2022-11-23T02:25:23.3388150Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65286 2022-11-23T02:25:23.3388778Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65287 2022-11-23T02:25:23.3389377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3389831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3390419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3390890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3391457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3391907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3392483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3393000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3393457Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3393958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3394626Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3395479Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3396014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3396491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3397761Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3398533Z warnings.warn( 2022-11-23T02:25:23.3399737Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3400518Z warnings.warn( 2022-11-23T02:25:23.3401296Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3401884Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3402681Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3403263Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3404271Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3404939Z warnings.warn( 2022-11-23T02:25:23.3405903Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3406534Z warnings.warn( 2022-11-23T02:25:23.3407503Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T02:25:23.3408162Z warnings.warn( 2022-11-23T02:25:23.3409120Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3409857Z warnings.warn( 2022-11-23T02:25:23.3410091Z dist init r=1, world=2 2022-11-23T02:25:23.3410344Z dist init r=0, world=2 2022-11-23T02:25:23.3410589Z ok (3.812s) 2022-11-23T02:25:23.3411113Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65369 2022-11-23T02:25:23.3411742Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65370 2022-11-23T02:25:23.3412359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3412811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3413381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3413855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3414496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3414938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3415512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3415976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3416434Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3416923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3417590Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3418290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3418820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3419278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3420540Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3421325Z warnings.warn( 2022-11-23T02:25:23.3422485Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3423295Z warnings.warn( 2022-11-23T02:25:23.3424061Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3424636Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3425444Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3426101Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3426370Z dist init r=1, world=2 2022-11-23T02:25:23.3426626Z dist init r=0, world=2 2022-11-23T02:25:23.3426872Z ok (3.812s) 2022-11-23T02:25:23.3427399Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65452 2022-11-23T02:25:23.3428026Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65453 2022-11-23T02:25:23.3428649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3429105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3429671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3430144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3430728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3431233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3431802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3432270Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3432725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3433211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3433883Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3434578Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3435403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3435874Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3437137Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3437920Z warnings.warn( 2022-11-23T02:25:23.3439062Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:25:23.3439829Z warnings.warn( 2022-11-23T02:25:23.3440585Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3441168Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3441975Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3442642Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3442911Z dist init r=1, world=2 2022-11-23T02:25:23.3443166Z dist init r=0, world=2 2022-11-23T02:25:23.3443408Z ok (3.812s) 2022-11-23T02:25:23.3443935Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65535 2022-11-23T02:25:23.3444567Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65536 2022-11-23T02:25:23.3445186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3445639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3446210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3446686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3447273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3447796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3448368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3448839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3449297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3449788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3450452Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3451154Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3451680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3452142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3453409Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3454184Z warnings.warn( 2022-11-23T02:25:23.3455319Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3456088Z warnings.warn( 2022-11-23T02:25:23.3456864Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3457435Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3458246Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 
2022-11-23T02:25:23.3458885Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3459169Z dist init r=1, world=2 2022-11-23T02:25:23.3459405Z dist init r=0, world=2 2022-11-23T02:25:23.3459646Z ok (3.812s) 2022-11-23T02:25:23.3460188Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65618 2022-11-23T02:25:23.3460792Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65619 2022-11-23T02:25:23.3461406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3461859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3462439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3462898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3463485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3463986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3464560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3465027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3465484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3465986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3466632Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3467332Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3467858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3468339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3469593Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3470357Z warnings.warn( 2022-11-23T02:25:23.3471498Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3472268Z warnings.warn( 2022-11-23T02:25:23.3473044Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3473608Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3474411Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. 
It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3475209Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3475502Z dist init r=1, world=2 2022-11-23T02:25:23.3475735Z dist init r=0, world=2 2022-11-23T02:25:23.3475976Z ok (3.812s) 2022-11-23T02:25:23.3476416Z test_summon_from_non_fsdp (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65701 2022-11-23T02:25:23.3476930Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65702 2022-11-23T02:25:23.3477550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3478003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3478588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3479051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3479632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3480076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3480712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3481195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3481647Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3482147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3482793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3483486Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3484016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3484496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3484843Z dist init r=0, world=2 2022-11-23T02:25:23.3485103Z dist init r=1, world=2 2022-11-23T02:25:23.3485347Z ok (3.311s) 2022-11-23T02:25:23.3485855Z test_summon_full_param_recursive_recurse_False_summon_outer_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65780 2022-11-23T02:25:23.3486462Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65781 2022-11-23T02:25:23.3487078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3487532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3488097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3488565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3489148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3489580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3490157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3490619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3491071Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3491560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3492302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3492996Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3493525Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3493985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3495244Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3496024Z warnings.warn( 2022-11-23T02:25:23.3497223Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:25:23.3497987Z warnings.warn( 2022-11-23T02:25:23.3498220Z dist init r=1, world=2 2022-11-23T02:25:23.3498468Z dist init r=0, world=2 2022-11-23T02:25:23.3498704Z ok (3.210s) 2022-11-23T02:25:23.3499202Z test_summon_full_param_recursive_recurse_False_summon_outer_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65859 2022-11-23T02:25:23.3499813Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65860 2022-11-23T02:25:23.3500431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3500884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3501457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3501930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3502512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3502942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3503519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3503979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3504435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3504919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3505583Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3506281Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3506803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3507259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3508514Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3509347Z warnings.warn( 2022-11-23T02:25:23.3510491Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3511253Z warnings.warn( 2022-11-23T02:25:23.3511486Z dist init r=1, world=2 2022-11-23T02:25:23.3511727Z dist init r=0, world=2 2022-11-23T02:25:23.3511964Z ok (3.310s) 2022-11-23T02:25:23.3512464Z test_summon_full_param_recursive_recurse_False_summon_outer_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65938 2022-11-23T02:25:23.3513061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65939 2022-11-23T02:25:23.3513734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3514200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3514763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3515450Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3516040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3516477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3517053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3517521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3517983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3518468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3519125Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3519820Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3520349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3520812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3522071Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3522888Z warnings.warn( 2022-11-23T02:25:23.3524034Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3524901Z warnings.warn( 2022-11-23T02:25:23.3525134Z dist init r=1, world=2 2022-11-23T02:25:23.3525388Z dist init r=0, world=2 2022-11-23T02:25:23.3525629Z ok (3.210s) 2022-11-23T02:25:23.3526131Z test_summon_full_param_recursive_recurse_False_summon_outer_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66017 2022-11-23T02:25:23.3526736Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66018 2022-11-23T02:25:23.3527353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3527809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3528376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3528853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3529436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3529867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3530560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3531041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3531495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3531983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3532644Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3533342Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3533867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3534329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3535589Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3536369Z warnings.warn( 2022-11-23T02:25:23.3537511Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3538287Z warnings.warn( 2022-11-23T02:25:23.3538524Z dist init r=0, world=2 2022-11-23T02:25:23.3538776Z dist init r=1, world=2 2022-11-23T02:25:23.3539017Z ok (3.310s) 2022-11-23T02:25:23.3539514Z test_summon_full_param_recursive_recurse_True_summon_outer_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66096 2022-11-23T02:25:23.3540114Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66097 2022-11-23T02:25:23.3540735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3541273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3541841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3542313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3542899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3543331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3543907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3544374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3544831Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3545324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3545980Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3546671Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3547255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3547726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3548992Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3549768Z warnings.warn( 2022-11-23T02:25:23.3550909Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3551680Z warnings.warn( 2022-11-23T02:25:23.3551915Z dist init r=1, world=2 2022-11-23T02:25:23.3552166Z dist init r=0, world=2 2022-11-23T02:25:23.3552406Z ok (3.211s) 2022-11-23T02:25:23.3552906Z test_summon_full_param_recursive_recurse_True_summon_outer_False_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66175 2022-11-23T02:25:23.3553520Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66176 2022-11-23T02:25:23.3554137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3554590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3555325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3555803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3556390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3556818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3557390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3557950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3558403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3558893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3559557Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3560247Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3560770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3561229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3562486Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3563276Z warnings.warn( 2022-11-23T02:25:23.3564489Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3565253Z warnings.warn( 2022-11-23T02:25:23.3565486Z dist init r=1, world=2 2022-11-23T02:25:23.3565745Z dist init r=0, world=2 2022-11-23T02:25:23.3565987Z ok (3.210s) 2022-11-23T02:25:23.3566488Z test_summon_full_param_recursive_recurse_True_summon_outer_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66254 2022-11-23T02:25:23.3567095Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66255 2022-11-23T02:25:23.3567721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3568174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3568738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3569211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3569792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3570242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3570802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3571269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3571729Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3572219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3572877Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3573573Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3574102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3574627Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3575890Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3576666Z warnings.warn( 2022-11-23T02:25:23.3577808Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3578580Z warnings.warn( 2022-11-23T02:25:23.3578815Z dist init r=1, world=2 2022-11-23T02:25:23.3579064Z dist init r=0, world=2 2022-11-23T02:25:23.3579304Z ok (3.310s) 2022-11-23T02:25:23.3579851Z test_summon_full_param_recursive_recurse_True_summon_outer_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66333 2022-11-23T02:25:23.3580466Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66334 2022-11-23T02:25:23.3581085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3581534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3582099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3582566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3583149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3583594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3584155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3584615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3585068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3585557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3586216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3586915Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3587438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3587896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3589152Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3589934Z warnings.warn( 2022-11-23T02:25:23.3591078Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3591918Z warnings.warn( 2022-11-23T02:25:23.3592157Z dist init r=0, world=2 2022-11-23T02:25:23.3592407Z dist init r=1, world=2 2022-11-23T02:25:23.3592645Z ok (3.310s) 2022-11-23T02:25:23.3593105Z test_summon_full_param_shard_value_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66412 2022-11-23T02:25:23.3593671Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66413 2022-11-23T02:25:23.3594290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3594748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3595538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3596008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3596675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3597133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3597697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3598164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3598621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3599113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3599770Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3600462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3600990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3601447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3601798Z dist init r=0, world=2 2022-11-23T02:25:23.3602052Z dist init r=1, world=2 2022-11-23T02:25:23.3602273Z ok (3.310s) 2022-11-23T02:25:23.3602745Z test_summon_full_param_shard_value_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66491 2022-11-23T02:25:23.3603318Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66492 2022-11-23T02:25:23.3603940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3604378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3604961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3605434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3606002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3606449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3607023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3607490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3608021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3608521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3609188Z INFO:torch.distributed.distributed_c10d:Rank 1: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3609889Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3610399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3610871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3611232Z dist init r=0, world=2 2022-11-23T02:25:23.3611468Z dist init r=1, world=2 2022-11-23T02:25:23.3611706Z ok (3.310s) 2022-11-23T02:25:23.3612156Z test_summon_full_param_writeback (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66570 2022-11-23T02:25:23.3612696Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66571 2022-11-23T02:25:23.3613295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3613809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3614402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3614859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3615446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3615896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3616480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3616930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3617386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 
1 2022-11-23T02:25:23.3617890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3618538Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3619231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3619759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3620234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3620580Z dist init r=1, world=2 2022-11-23T02:25:23.3620830Z dist init r=0, world=2 2022-11-23T02:25:23.3621072Z ok (3.409s) 2022-11-23T02:25:23.3621556Z test_summon_full_params_equivalence_rank0_only_False_offload_to_cpu_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66649 2022-11-23T02:25:23.3622147Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66650 2022-11-23T02:25:23.3622802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3623257Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3623811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3624263Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3624839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3625385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3625965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: 
loaded 420 disabled tests 2022-11-23T02:25:23.3626435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3626888Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3627376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3628039Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3628733Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3629262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3629716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3630071Z dist init r=0, world=2 2022-11-23T02:25:23.3630325Z dist init r=1, world=2 2022-11-23T02:25:23.3630550Z ok (3.310s) 2022-11-23T02:25:23.3631105Z test_summon_full_params_equivalence_rank0_only_False_offload_to_cpu_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66728 2022-11-23T02:25:23.3631701Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66729 2022-11-23T02:25:23.3632316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3632749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3633331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3633810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3634376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3634819Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3635623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3636091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3636527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3637025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3637681Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3638378Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3638885Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3639367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3640460Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3641136Z warnings.warn( 2022-11-23T02:25:23.3642088Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:818: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3642843Z warnings.warn( 2022-11-23T02:25:23.3643812Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-11-23T02:25:23.3644464Z warnings.warn( 2022-11-23T02:25:23.3645425Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_unshard_param_utils.py:147: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 
2022-11-23T02:25:23.3646061Z warnings.warn( 2022-11-23T02:25:23.3646312Z dist init r=0, world=2 2022-11-23T02:25:23.3646566Z dist init r=1, world=2 2022-11-23T02:25:23.3646789Z ok (3.310s) 2022-11-23T02:25:23.3647357Z test_summon_full_params_equivalence_rank0_only_True_offload_to_cpu_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66807 2022-11-23T02:25:23.3647956Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66808 2022-11-23T02:25:23.3648578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3649016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3649594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3650066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3650655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3651087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3651667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3652133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3652573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3653078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3653740Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3654435Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3654948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3655179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3655295Z dist init r=0, world=2 2022-11-23T02:25:23.3655406Z dist init r=1, world=2 2022-11-23T02:25:23.3655511Z ok (3.310s) 2022-11-23T02:25:23.3655882Z test_summon_full_params_equivalence_rank0_only_True_offload_to_cpu_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66886 2022-11-23T02:25:23.3656101Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66887 2022-11-23T02:25:23.3656465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3656643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3657113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3657306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3657674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3657854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3658232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3658422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3658672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3658899Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3659306Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3659705Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3659937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3660219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3660342Z dist init r=1, world=2 2022-11-23T02:25:23.3660452Z dist init r=0, world=2 2022-11-23T02:25:23.3660555Z ok (3.310s) 2022-11-23T02:25:23.3660891Z test_summon_full_params_respects_reshard_after_forward (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66965 2022-11-23T02:25:23.3661111Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66966 2022-11-23T02:25:23.3661489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3661672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3662058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3662256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3662624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3662800Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3663186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:25:23.3663362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3663607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3663856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3664260Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3664662Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3664893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3665122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3666141Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3666318Z warnings.warn( 2022-11-23T02:25:23.3667375Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:25:23.3667490Z warnings.warn( 2022-11-23T02:25:23.3668114Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3668266Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3668902Z /opt/conda/lib/python3.10/site-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. 2022-11-23T02:25:23.3669100Z warnings.warn(message, UserWarning) 2022-11-23T02:25:23.3669222Z dist init r=0, world=2 2022-11-23T02:25:23.3669333Z dist init r=1, world=2 2022-11-23T02:25:23.3669434Z ok (3.810s) 2022-11-23T02:25:23.3669746Z test_summon_single_param (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67044 2022-11-23T02:25:23.3669970Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67045 2022-11-23T02:25:23.3670333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3670514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3670897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3671089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3671461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3671639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3672013Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3672201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3672433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3672681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3673094Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3673496Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3673731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3673960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3674974Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3675365Z warnings.warn( 2022-11-23T02:25:23.3676377Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:25:23.3676490Z warnings.warn( 2022-11-23T02:25:23.3676602Z dist init r=0, world=2 2022-11-23T02:25:23.3676696Z dist init r=1, world=2 2022-11-23T02:25:23.3676797Z ok (3.310s) 2022-11-23T02:25:23.3676975Z test_with_grads_core (__main__.TestSummonFullParams) 2022-11-23T02:25:23.3677283Z Tests the core usage of ``summon_full_params(with_grads=True)``. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67123 2022-11-23T02:25:23.3677507Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67124 2022-11-23T02:25:23.3677885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3678064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3678529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3678717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3679097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3679273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3679648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3679843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3680092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3680339Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3680748Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3681150Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3681365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3681594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3681833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3682076Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3682304Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3682538Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3682774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3683007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3683220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3683445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3683675Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3683907Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3684224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3684449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3684674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:25:23.3684908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3685138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3685352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3685579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3685803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3686030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3686261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3686491Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3686718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3686991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3687209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:25:23.3687325Z dist init r=0, world=2 2022-11-23T02:25:23.3687439Z dist init r=1, world=2 2022-11-23T02:25:23.3687541Z ok (6.015s) 2022-11-23T02:25:23.3687734Z test_with_grads_none_grads (__main__.TestSummonFullParams) 2022-11-23T02:25:23.3688187Z Tests that if all ranks' ``FlatParameter`` has ``None`` gradient, then ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67206 2022-11-23T02:25:23.3688413Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67207 2022-11-23T02:25:23.3688791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3688953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3689341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3689536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3689902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3690078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3690457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3690651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3690900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:23.3691126Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3691536Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:23.3691936Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:23.3692169Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:23.3692401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3692516Z dist init r=0, world=2 2022-11-23T02:25:23.3692626Z dist init r=1, world=2 2022-11-23T02:25:23.3692790Z ok (3.410s) 2022-11-23T02:25:23.3693116Z test_summon_full_param_writeback (__main__.TestSummonFullParamsNoShard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67285 2022-11-23T02:25:23.3693495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:23.3693676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:23.3694057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:23.3694252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:23.3694500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:23.3694899Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:25:23.3695128Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:23.3695244Z dist init r=0, world=1 2022-11-23T02:25:23.3695329Z ok (3.208s) 2022-11-23T02:25:23.3695352Z 2022-11-23T02:25:23.3695625Z ---------------------------------------------------------------------- 2022-11-23T02:25:23.3695744Z Ran 52 tests in 181.113s 2022-11-23T02:25:23.3695763Z 2022-11-23T02:25:23.3695858Z OK 2022-11-23T02:25:23.3695932Z 2022-11-23T02:25:23.3696067Z Generating XML reports... 
2022-11-23T02:25:23.3696552Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParams-20221123022221.xml 2022-11-23T02:25:23.3697055Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParamsNoShard-20221123022221.xml 2022-11-23T02:25:23.3697076Z 2022-11-23T02:25:23.3697520Z ##[endgroup] 2022-11-23T02:25:23.3698023Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_summon_full_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_summon_full_params_46s8rm4e) 2022-11-23T02:25:23.3698067Z 2022-11-23T02:25:23.3698309Z Running distributed/test_c10d_gloo ... [2022-11-23 02:25:23.307542] 2022-11-23T02:25:23.3698809Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_gloo.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:25:23.307811] 2022-11-23T02:38:50.7201390Z 2022-11-23T02:38:50.7203899Z Expand the folded group to see the log file of distributed/test_c10d_gloo 2022-11-23T02:38:50.7204814Z ##[group]PRINTING LOG FILE of distributed/test_c10d_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_gloo_bc12qwi7) 2022-11-23T02:38:50.7205392Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpagmm4f47 2022-11-23T02:38:50.7205926Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpagmm4f47/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7207258Z , <__main__.CommTest testMethod=test_broadcast_coalesced_gloo_cuda>, <__main__.CommTest testMethod=test_gloo_barrier_device_ids>, <__main__.CommTest testMethod=test_gloo_rank_membership>, <__main__.CommTest testMethod=test_gloo_warn_not_in_group>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_default>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_subgroup>, <__main__.CommTest 
testMethod=test_sequence_num_set_default_pg_gloo>, <__main__.CommTest testMethod=test_sequence_num_set_gloo_new_group>, <__main__.CommTest testMethod=test_tensor_dtype_complex>, <__main__.CommTest testMethod=test_tensor_dtype_mismatch>]> 2022-11-23T02:38:50.7208795Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) 2022-11-23T02:38:50.7209136Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) 2022-11-23T02:38:50.7209485Z test_gloo_barrier_device_ids (__main__.CommTest) 2022-11-23T02:38:50.7213060Z test_gloo_rank_membership (__main__.CommTest) 2022-11-23T02:38:50.7213425Z test_gloo_warn_not_in_group (__main__.CommTest) 2022-11-23T02:38:50.7213801Z test_sequence_num_incremented_gloo_default (__main__.CommTest) 2022-11-23T02:38:50.7214191Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) 2022-11-23T02:38:50.7214542Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) 2022-11-23T02:38:50.7214906Z test_sequence_num_set_gloo_new_group (__main__.CommTest) 2022-11-23T02:38:50.7215251Z test_tensor_dtype_complex (__main__.CommTest) 2022-11-23T02:38:50.7215575Z test_tensor_dtype_mismatch (__main__.CommTest) 2022-11-23T02:38:50.7218952Z , <__main__.CompilerTest testMethod=test_allgather_work_wait_gpu>, <__main__.CompilerTest testMethod=test_allreduce_work_wait_cpu>, <__main__.CompilerTest testMethod=test_allreduce_work_wait_gpu>, <__main__.CompilerTest testMethod=test_broadcast_work_wait_cpu>, <__main__.CompilerTest testMethod=test_broadcast_work_wait_gpu>, <__main__.CompilerTest testMethod=test_consecutive_comm_work_wait_cpu>, <__main__.CompilerTest testMethod=test_consecutive_comm_work_wait_gpu>, <__main__.CompilerTest testMethod=test_nested_comm_tensor_wrapping>, <__main__.CompilerTest testMethod=test_scatter_work_wait_cpu>, <__main__.CompilerTest testMethod=test_scatter_work_wait_gpu>]> 2022-11-23T02:38:50.7220344Z test_allgather_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:38:50.7220970Z test_allgather_work_wait_gpu 
(__main__.CompilerTest) 2022-11-23T02:38:50.7221632Z test_allreduce_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:38:50.7222391Z test_allreduce_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:38:50.7222996Z test_broadcast_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:38:50.7223564Z test_broadcast_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:38:50.7224282Z test_consecutive_comm_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:38:50.7224793Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:38:50.7225173Z test_nested_comm_tensor_wrapping (__main__.CompilerTest) 2022-11-23T02:38:50.7225517Z test_scatter_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:38:50.7225951Z test_scatter_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:38:50.7236643Z , <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_False>, <__main__.DistributedDataParallelTest 
testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_cpu>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_gpu_gloo>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_register_just_once>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_init>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_return_type>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_when_unused_parameters_empty>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad_with_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad_with_static_graph>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_integer_list>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_torch_device_list>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_2gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_4gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output_with_unused_parameters>, <__main__.DistributedDataParallelTest testMethod=test_ignored_sharded_tensor>, <__main__.DistributedDataParallelTest testMethod=test_invalid_powerSGD_state>, <__main__.DistributedDataParallelTest testMethod=test_save_load_checkpoint>, <__main__.DistributedDataParallelTest 
testMethod=test_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_sparse_gradients_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_empty_input>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_only_empty_input>]> 2022-11-23T02:38:50.7247232Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7248132Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7249072Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7249996Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7250917Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7251902Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7252957Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7253845Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7254732Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7255700Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7256722Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7257693Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7258675Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7259593Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7260416Z 
test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7261327Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7262155Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7262951Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7263776Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7266122Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7267140Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7268025Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7268949Z test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7269818Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7270773Z test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7271631Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7272402Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7273223Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7274029Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7274842Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7276154Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7276978Z test_ignored_sharded_tensor (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7277873Z test_invalid_powerSGD_state 
(__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7278654Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7279412Z test_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7280239Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7281037Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7281846Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7283836Z , <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_allreduce_coalesced>, <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_collectives>, <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_monitored_barrier>]> 2022-11-23T02:38:50.7285915Z test_allgather_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:38:50.7286919Z test_allreduce_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:38:50.7287920Z test_collectives (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:38:50.7288881Z test_monitored_barrier (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:38:50.7289628Z 2022-11-23T02:38:50.7299550Z , <__main__.ProcessGroupGlooTest testMethod=test_allgather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda>, 
<__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_barrier_implies_wait>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_checks>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_empty_tensors>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_gather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_gather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_multi_device_constructor>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_stress>, <__main__.ProcessGroupGlooTest 
testMethod=test_reduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin_create_destroy>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_checks>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_send_recv_all_to_all>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_checks>]> 2022-11-23T02:38:50.7309300Z test_allgather_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7310042Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7310746Z test_allgather_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7311403Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7312097Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7312820Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7313559Z test_allgather_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7314251Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7314929Z test_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7316057Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7316969Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7317548Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7317951Z test_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7318318Z 
test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7318716Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7319116Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7319531Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7319922Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7320305Z test_allreduce_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7320791Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7321156Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7321528Z test_broadcast_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7321904Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7322255Z test_broadcast_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7322624Z test_broadcast_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7323001Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7323371Z test_empty_tensors (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7323714Z test_gather_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7324082Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7324443Z test_gather_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7324810Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7325198Z test_gather_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7325562Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7325931Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7326302Z test_reduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7326668Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7327080Z 
test_reduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7327447Z test_reduce_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7327809Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7328169Z test_round_robin (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7328526Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7328904Z test_scatter_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7329272Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7329622Z test_scatter_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7329983Z test_scatter_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7330355Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7330711Z test_send_recv_all_to_all (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7331095Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7331497Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7331895Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:38:50.7332744Z , <__main__.ReducerTest testMethod=test_forward_backward_optimizer>, <__main__.ReducerTest testMethod=test_forward_backward_unused_parameters>, <__main__.ReducerTest testMethod=test_multi_dtype_multi_bucket>, <__main__.ReducerTest testMethod=test_multi_dtype_single_bucket>, <__main__.ReducerTest testMethod=test_single_dtype_single_bucket>]> 2022-11-23T02:38:50.7333535Z test_forward_backward (__main__.ReducerTest) 2022-11-23T02:38:50.7333886Z test_forward_backward_optimizer (__main__.ReducerTest) 2022-11-23T02:38:50.7334262Z test_forward_backward_unused_parameters (__main__.ReducerTest) 2022-11-23T02:38:50.7334609Z test_multi_dtype_multi_bucket (__main__.ReducerTest) 2022-11-23T02:38:50.7334959Z test_multi_dtype_single_bucket (__main__.ReducerTest) 2022-11-23T02:38:50.7335316Z 
test_single_dtype_single_bucket (__main__.ReducerTest) 2022-11-23T02:38:50.7335726Z ]> 2022-11-23T02:38:50.7336142Z test_logging_init (__main__.RendezvousEnvTest) 2022-11-23T02:38:50.7336468Z 2022-11-23T02:38:50.7336872Z ]> 2022-11-23T02:38:50.7337304Z test_default_store_timeout_gloo (__main__.TimeoutTest) 2022-11-23T02:38:50.7338001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7338524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7339093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7339576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7340049Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi3w5g39o 2022-11-23T02:38:50.7340577Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi3w5g39o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7340885Z 2022-11-23T02:38:50.7340998Z Running tests... 2022-11-23T02:38:50.7341411Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7341947Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7342423Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7342895Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67393 2022-11-23T02:38:50.7343347Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67394 2022-11-23T02:38:50.7344045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7344497Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7345083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7345555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7346120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7346568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7347152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7347621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7348067Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoi8498aq 2022-11-23T02:38:50.7348612Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoi8498aq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7349147Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc37r9f77 2022-11-23T02:38:50.7349666Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc37r9f77/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7350175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7350649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7351005Z ok 
(4.031s) 2022-11-23T02:38:50.7351138Z 2022-11-23T02:38:50.7351416Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7351748Z Ran 1 test in 4.031s 2022-11-23T02:38:50.7351912Z 2022-11-23T02:38:50.7352007Z OK 2022-11-23T02:38:50.7352141Z 2022-11-23T02:38:50.7352250Z Generating XML reports... 2022-11-23T02:38:50.7352848Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022527.xml 2022-11-23T02:38:50.7353518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7353972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7354538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7355008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7356072Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa0vdjvfv 2022-11-23T02:38:50.7356710Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa0vdjvfv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7357016Z 2022-11-23T02:38:50.7357125Z Running tests... 2022-11-23T02:38:50.7357547Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7358090Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7358563Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7359038Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67502 2022-11-23T02:38:50.7359490Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67503 2022-11-23T02:38:50.7360087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7360551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7361135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7361607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7362241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7362704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7363286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7363753Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7364202Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg5wbg9bx 2022-11-23T02:38:50.7364746Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg5wbg9bx/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7365265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7365750Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa3u_6wb0 2022-11-23T02:38:50.7366283Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa3u_6wb0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7366794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7367143Z ok 
(4.956s) 2022-11-23T02:38:50.7367277Z 2022-11-23T02:38:50.7367556Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7367887Z Ran 1 test in 4.956s 2022-11-23T02:38:50.7368050Z 2022-11-23T02:38:50.7368144Z OK 2022-11-23T02:38:50.7368279Z 2022-11-23T02:38:50.7368387Z Generating XML reports... 2022-11-23T02:38:50.7368931Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022533.xml 2022-11-23T02:38:50.7369606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7370061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7370622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7371098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7371570Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy8wejv1o 2022-11-23T02:38:50.7372096Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy8wejv1o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7372399Z 2022-11-23T02:38:50.7372509Z Running tests... 2022-11-23T02:38:50.7372921Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7373454Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7373977Z test_gloo_barrier_device_ids (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7374437Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67613 2022-11-23T02:38:50.7374889Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67614 2022-11-23T02:38:50.7375494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7375949Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7376534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7377005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7377569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7378023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7378603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7379070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7379519Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuomuyq59 2022-11-23T02:38:50.7380112Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuomuyq59/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7380659Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt1oh0_uv 2022-11-23T02:38:50.7381179Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt1oh0_uv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7381691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7382167Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7382662Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7383143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7383815Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7384511Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7384908Z ok (3.924s) 2022-11-23T02:38:50.7385042Z 2022-11-23T02:38:50.7385313Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7385643Z Ran 1 test in 3.924s 2022-11-23T02:38:50.7385807Z 2022-11-23T02:38:50.7385901Z OK 2022-11-23T02:38:50.7386019Z 2022-11-23T02:38:50.7386146Z Generating XML reports... 2022-11-23T02:38:50.7386695Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022540.xml 2022-11-23T02:38:50.7387373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7387829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7388397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7388870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7389334Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1dxfhq8q 2022-11-23T02:38:50.7389863Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1dxfhq8q/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7390164Z 2022-11-23T02:38:50.7390274Z Running tests... 
2022-11-23T02:38:50.7390684Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7391292Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7391750Z test_gloo_rank_membership (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7392211Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67722 2022-11-23T02:38:50.7392666Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67723 2022-11-23T02:38:50.7393270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7393724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7394307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7394783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7395958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7396418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7396996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7397447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7398007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjjcov1pv 2022-11-23T02:38:50.7398568Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjjcov1pv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7399078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7399564Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at 
/tmp/tmp146g07gg 2022-11-23T02:38:50.7400100Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp146g07gg/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7400613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7401105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7401589Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7402265Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7402959Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7403479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:38:50.7403973Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:38:50.7404630Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7405330Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7405711Z ok (3.978s) 2022-11-23T02:38:50.7405863Z 2022-11-23T02:38:50.7406130Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7406463Z Ran 1 test in 3.978s 2022-11-23T02:38:50.7406629Z 2022-11-23T02:38:50.7406725Z OK 2022-11-23T02:38:50.7406843Z 2022-11-23T02:38:50.7406967Z Generating XML reports... 
2022-11-23T02:38:50.7407514Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022547.xml 2022-11-23T02:38:50.7408183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7408620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7409201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7409763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7410234Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2gy7xl_j 2022-11-23T02:38:50.7410761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2gy7xl_j/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7411067Z 2022-11-23T02:38:50.7411178Z Running tests... 2022-11-23T02:38:50.7411587Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7412099Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7412573Z test_gloo_warn_not_in_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7413034Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67834 2022-11-23T02:38:50.7413484Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67835 2022-11-23T02:38:50.7414088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7414547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7415125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7415634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7416232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7416683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7417257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7417712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7418186Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpibvnvj65 2022-11-23T02:38:50.7418728Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpibvnvj65/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7419248Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfusrxjsp 2022-11-23T02:38:50.7419791Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfusrxjsp/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7420333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7420815Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7421288Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7421782Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7422450Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7422994Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:38:50.7423631Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7424164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:38:50.7424822Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7425511Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7425889Z ok (4.834s) 2022-11-23T02:38:50.7426041Z 2022-11-23T02:38:50.7426314Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7426711Z Ran 1 test in 4.834s 2022-11-23T02:38:50.7426874Z 2022-11-23T02:38:50.7426952Z OK 2022-11-23T02:38:50.7427085Z 2022-11-23T02:38:50.7427211Z Generating XML reports... 
2022-11-23T02:38:50.7427757Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022553.xml 2022-11-23T02:38:50.7428434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7428870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7429453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7429927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7430379Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8ubklbas 2022-11-23T02:38:50.7430920Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8ubklbas/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7431225Z 2022-11-23T02:38:50.7431335Z Running tests... 2022-11-23T02:38:50.7431765Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7432282Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7432835Z test_sequence_num_incremented_gloo_default (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7433329Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67948 2022-11-23T02:38:50.7433764Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67949 2022-11-23T02:38:50.7434376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7434827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7436043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7436850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7437716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7438188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7438757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7439230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7439697Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzh8qwe27 2022-11-23T02:38:50.7440247Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzh8qwe27/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7440742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7441250Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd8k9f0qx 2022-11-23T02:38:50.7441788Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd8k9f0qx/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7442297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7442774Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7443273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7443938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7444610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7445144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:38:50.7445754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:38:50.7446411Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7447086Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7447482Z ok (4.880s) 2022-11-23T02:38:50.7447634Z 2022-11-23T02:38:50.7447903Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7448238Z Ran 1 test in 4.880s 2022-11-23T02:38:50.7448383Z 2022-11-23T02:38:50.7448479Z OK 2022-11-23T02:38:50.7448613Z 2022-11-23T02:38:50.7448739Z Generating XML reports... 
2022-11-23T02:38:50.7449287Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022600.xml 2022-11-23T02:38:50.7449937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7450392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7450975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7451451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7451972Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy6p611m7 2022-11-23T02:38:50.7452575Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy6p611m7/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7452878Z 2022-11-23T02:38:50.7452989Z Running tests... 2022-11-23T02:38:50.7453386Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7453921Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7454423Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7454915Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68065 2022-11-23T02:38:50.7455348Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68066 2022-11-23T02:38:50.7455963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7456421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7456986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7457464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7458052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7458501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7459067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7459543Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7460010Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpknvs3n3v 2022-11-23T02:38:50.7460561Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpknvs3n3v/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7461058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7461559Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8_tfwsca 2022-11-23T02:38:50.7462097Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8_tfwsca/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7462588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7463057Z 
skip: Need at least 4 CUDA devices (3.961s) 2022-11-23T02:38:50.7463256Z 2022-11-23T02:38:50.7463533Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7463864Z Ran 1 test in 3.961s 2022-11-23T02:38:50.7464008Z 2022-11-23T02:38:50.7464118Z OK (skipped=1) 2022-11-23T02:38:50.7464275Z 2022-11-23T02:38:50.7464399Z Generating XML reports... 2022-11-23T02:38:50.7464942Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022607.xml 2022-11-23T02:38:50.7465591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7466047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7466631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7467105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7467558Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsl9iqq1y 2022-11-23T02:38:50.7468101Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsl9iqq1y/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7468402Z 2022-11-23T02:38:50.7468511Z Running tests... 2022-11-23T02:38:50.7468902Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7469501Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7470005Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7470483Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68168 2022-11-23T02:38:50.7470916Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68169 2022-11-23T02:38:50.7471528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7471987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7472553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7473026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7473611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7474063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7474623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7475526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7476087Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptoci6jzh 2022-11-23T02:38:50.7476642Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptoci6jzh/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7477146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7477651Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb6000eue 2022-11-23T02:38:50.7478192Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb6000eue/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7478691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7479182Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7479676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7480352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7481031Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7481537Z ok (3.936s) 2022-11-23T02:38:50.7481691Z 2022-11-23T02:38:50.7481963Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7482295Z Ran 1 test in 3.936s 2022-11-23T02:38:50.7482441Z 2022-11-23T02:38:50.7482535Z OK 2022-11-23T02:38:50.7482670Z 2022-11-23T02:38:50.7482801Z Generating XML reports... 2022-11-23T02:38:50.7483348Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022614.xml 2022-11-23T02:38:50.7484000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7484455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7485036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7485513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7485963Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdt73f74a 2022-11-23T02:38:50.7486507Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdt73f74a/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7486811Z 2022-11-23T02:38:50.7486921Z Running tests... 
2022-11-23T02:38:50.7487380Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7487930Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7488417Z test_sequence_num_set_gloo_new_group (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7488889Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68277 2022-11-23T02:38:50.7489322Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68278 2022-11-23T02:38:50.7489932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7490396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7490957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7491428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7492013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7492482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7493042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7493513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7493979Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp12oj27t6 2022-11-23T02:38:50.7494495Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpupe984qs 2022-11-23T02:38:50.7495032Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp12oj27t6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7495580Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpupe984qs/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7496098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7496556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7497036Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7497534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7498198Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7498949Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7499484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:38:50.7499981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:38:50.7500624Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7501308Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:38:50.7501705Z ok (4.063s) 2022-11-23T02:38:50.7501856Z 2022-11-23T02:38:50.7502126Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7502440Z Ran 1 test in 4.063s 2022-11-23T02:38:50.7502601Z 2022-11-23T02:38:50.7502697Z OK 2022-11-23T02:38:50.7502836Z 2022-11-23T02:38:50.7502962Z Generating XML reports... 
2022-11-23T02:38:50.7503491Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022620.xml 2022-11-23T02:38:50.7504163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7504688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7505290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7505746Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7506215Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzi03zic8 2022-11-23T02:38:50.7506760Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzi03zic8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7507062Z 2022-11-23T02:38:50.7507178Z Running tests... 2022-11-23T02:38:50.7507571Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7508103Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7508579Z test_tensor_dtype_complex (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7509025Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68392 2022-11-23T02:38:50.7509472Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68393 2022-11-23T02:38:50.7510085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7510537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7511098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7511566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7512159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7512604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7513184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7513652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7514120Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcuw6_ae8 2022-11-23T02:38:50.7514648Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcuw6_ae8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7515621Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7516153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu4tqaz_y 2022-11-23T02:38:50.7516800Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu4tqaz_y/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7517336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7517829Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7518331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7518988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7519683Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7520080Z ok (3.959s) 2022-11-23T02:38:50.7520231Z 2022-11-23T02:38:50.7520501Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7520813Z Ran 1 test in 3.959s 2022-11-23T02:38:50.7520980Z 2022-11-23T02:38:50.7521076Z OK 2022-11-23T02:38:50.7521211Z 2022-11-23T02:38:50.7521336Z Generating XML reports... 2022-11-23T02:38:50.7521866Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022626.xml 2022-11-23T02:38:50.7522531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7523054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7523655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7524113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7524583Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpetgpondr 2022-11-23T02:38:50.7525133Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpetgpondr/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7525442Z 2022-11-23T02:38:50.7525553Z Running tests... 
2022-11-23T02:38:50.7525945Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7526476Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7526951Z test_tensor_dtype_mismatch (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7527426Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68501 2022-11-23T02:38:50.7527878Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68502 2022-11-23T02:38:50.7528496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7528935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7529518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7529998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7530606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7531039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7531620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7532086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7532550Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpggepv_kt 2022-11-23T02:38:50.7533078Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpggepv_kt/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7533613Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjvv6bmr0 2022-11-23T02:38:50.7534152Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpjvv6bmr0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7534715Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7535192Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7535675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7536174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7536828Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7537520Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7538570Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.7539206Z warnings.warn( 2022-11-23T02:38:50.7540120Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.7540758Z warnings.warn( 2022-11-23T02:38:50.7541623Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.7542240Z warnings.warn( 2022-11-23T02:38:50.7543079Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.7543722Z warnings.warn( 2022-11-23T02:38:50.7543961Z ok (4.074s) 2022-11-23T02:38:50.7544110Z 2022-11-23T02:38:50.7544384Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7544699Z Ran 1 test in 4.074s 2022-11-23T02:38:50.7544862Z 2022-11-23T02:38:50.7544957Z OK 2022-11-23T02:38:50.7545091Z 2022-11-23T02:38:50.7545216Z Generating XML reports... 2022-11-23T02:38:50.7545743Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022633.xml 2022-11-23T02:38:50.7546415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7546871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7547456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7547914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7548384Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprcvqzc0k 2022-11-23T02:38:50.7548935Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprcvqzc0k/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7549236Z 2022-11-23T02:38:50.7549346Z Running tests... 
2022-11-23T02:38:50.7549739Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7550274Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7550767Z test_allgather_work_wait_cpu (__main__.CompilerTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7551220Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68610 2022-11-23T02:38:50.7551742Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68611 2022-11-23T02:38:50.7552403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7552864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7553431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7553905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7554490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7554923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7555928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7556401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7556875Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpml5rxozm 2022-11-23T02:38:50.7557408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpml5rxozm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7557924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7558538Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmpqv81xuld 2022-11-23T02:38:50.7559075Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqv81xuld/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7559590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7560080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7560575Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7561231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7561926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7562862Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7563594Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7564429Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7565155Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7565487Z ok (4.087s) 2022-11-23T02:38:50.7565639Z 2022-11-23T02:38:50.7565910Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7566225Z Ran 1 test in 4.087s 2022-11-23T02:38:50.7566405Z 2022-11-23T02:38:50.7566500Z OK 2022-11-23T02:38:50.7566636Z 2022-11-23T02:38:50.7566765Z Generating XML reports... 
2022-11-23T02:38:50.7567315Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022639.xml 2022-11-23T02:38:50.7567997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7568455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7569039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7569586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7570051Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxb_eg16i 2022-11-23T02:38:50.7570593Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxb_eg16i/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7570893Z 2022-11-23T02:38:50.7571003Z Running tests... 2022-11-23T02:38:50.7571419Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7571959Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7572445Z test_allgather_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7572897Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68719 2022-11-23T02:38:50.7573348Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68720 2022-11-23T02:38:50.7573964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7574536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7575103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7575630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7576227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7576655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7577231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7577699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7578168Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnh1ukaaf 2022-11-23T02:38:50.7578696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnh1ukaaf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7579230Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8gffeqwn 2022-11-23T02:38:50.7579769Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8gffeqwn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7580284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7580747Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7581233Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7581727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7582381Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7583080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7584014Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7584742Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7585597Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7586297Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7586628Z ok (4.919s) 2022-11-23T02:38:50.7586841Z 2022-11-23T02:38:50.7587112Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7587427Z Ran 1 test in 4.919s 2022-11-23T02:38:50.7587589Z 2022-11-23T02:38:50.7587684Z OK 2022-11-23T02:38:50.7587819Z 2022-11-23T02:38:50.7587946Z Generating XML reports... 
2022-11-23T02:38:50.7588513Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022646.xml 2022-11-23T02:38:50.7589183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7589637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7590214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7590669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7591133Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz2imscyz 2022-11-23T02:38:50.7591682Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz2imscyz/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7591982Z 2022-11-23T02:38:50.7592093Z Running tests... 2022-11-23T02:38:50.7592484Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7593082Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7593580Z test_allreduce_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7594053Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68830 2022-11-23T02:38:50.7594512Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68831 2022-11-23T02:38:50.7595470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7595946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7596523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7596995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7597577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7598030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7598588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7599054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7599521Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7v09i3u 2022-11-23T02:38:50.7600048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7v09i3u/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7600586Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqa655e3j 2022-11-23T02:38:50.7601125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqa655e3j/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7601639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7602101Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7602585Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7603086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7603735Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7604430Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7605472Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7606197Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7607050Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7607753Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7608599Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7609320Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7610239Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:38:50.7610950Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7611281Z ok (3.917s) 2022-11-23T02:38:50.7611431Z 2022-11-23T02:38:50.7611701Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7612035Z Ran 1 test in 3.917s 2022-11-23T02:38:50.7612181Z 2022-11-23T02:38:50.7612277Z OK 2022-11-23T02:38:50.7612431Z 2022-11-23T02:38:50.7612557Z Generating XML reports... 2022-11-23T02:38:50.7613117Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022653.xml 2022-11-23T02:38:50.7613789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7614247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7614833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7615311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7615764Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnci4cu6v 2022-11-23T02:38:50.7616309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnci4cu6v/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7616615Z 2022-11-23T02:38:50.7616727Z Running tests... 2022-11-23T02:38:50.7617121Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7617658Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7618143Z test_allreduce_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7618615Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68939 2022-11-23T02:38:50.7619049Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68940 2022-11-23T02:38:50.7619663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7620119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7620684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7621159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7621743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7622259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7622820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7623292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7623789Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjkp8kdla 2022-11-23T02:38:50.7624339Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjkp8kdla/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7624838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7625346Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx1rejz_0 2022-11-23T02:38:50.7625889Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx1rejz_0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7626386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7626878Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7627374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7628092Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7628784Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7629713Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7630455Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7631309Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7632024Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7632858Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7633567Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7634412Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:38:50.7635451Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7635814Z ok (4.864s) 2022-11-23T02:38:50.7635966Z 2022-11-23T02:38:50.7636245Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7636577Z Ran 1 test in 4.864s 2022-11-23T02:38:50.7636764Z 2022-11-23T02:38:50.7636842Z OK 2022-11-23T02:38:50.7636981Z 2022-11-23T02:38:50.7637108Z Generating XML reports... 2022-11-23T02:38:50.7637678Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022659.xml 2022-11-23T02:38:50.7638359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7638796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7639374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7639954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7640409Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpga57enqn 2022-11-23T02:38:50.7640951Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpga57enqn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7641260Z 2022-11-23T02:38:50.7641371Z Running tests... 2022-11-23T02:38:50.7641784Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7642301Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7642786Z test_broadcast_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7643284Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69050 2022-11-23T02:38:50.7643720Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69051 2022-11-23T02:38:50.7644340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7644793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7645376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7645902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7646506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7646956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7647532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7647986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7648459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnt27d70m 2022-11-23T02:38:50.7649006Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnt27d70m/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7649498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7650002Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy4f54qq3 2022-11-23T02:38:50.7650540Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy4f54qq3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7651051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7651523Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7652016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7652732Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7653437Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7654357Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7655087Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7655943Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7656648Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7657021Z ok (4.078s) 2022-11-23T02:38:50.7657173Z 2022-11-23T02:38:50.7657448Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7657780Z Ran 1 test in 4.078s 2022-11-23T02:38:50.7657942Z 2022-11-23T02:38:50.7658020Z OK 2022-11-23T02:38:50.7658153Z 2022-11-23T02:38:50.7658280Z Generating XML reports... 
2022-11-23T02:38:50.7658847Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022706.xml 2022-11-23T02:38:50.7659528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7659964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7660539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7661011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7661486Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl_uuvr5z 2022-11-23T02:38:50.7662009Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl_uuvr5z/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7662311Z 2022-11-23T02:38:50.7662421Z Running tests... 2022-11-23T02:38:50.7662881Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7663411Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7663898Z test_broadcast_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7664365Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69159 2022-11-23T02:38:50.7664813Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69160 2022-11-23T02:38:50.7665405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7665869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7666453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7666919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7667510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7667986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7668563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7669014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7669483Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy0zus8ib 2022-11-23T02:38:50.7670025Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy0zus8ib/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7670524Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7671028Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1ahheax3 2022-11-23T02:38:50.7671568Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1ahheax3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7672081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7672553Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7673051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7673715Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7674407Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7675813Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7676571Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7677427Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7678143Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7678457Z ok (4.921s) 2022-11-23T02:38:50.7678608Z 2022-11-23T02:38:50.7678878Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7679219Z Ran 1 test in 4.921s 2022-11-23T02:38:50.7679383Z 2022-11-23T02:38:50.7679477Z OK 2022-11-23T02:38:50.7679595Z 2022-11-23T02:38:50.7679722Z Generating XML reports... 
2022-11-23T02:38:50.7680285Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022713.xml 2022-11-23T02:38:50.7681055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7681508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7682091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7682564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7683033Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbyf0gurh 2022-11-23T02:38:50.7683567Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbyf0gurh/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7683871Z 2022-11-23T02:38:50.7683983Z Running tests... 2022-11-23T02:38:50.7684395Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7684910Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7685416Z test_consecutive_comm_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7685898Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69270 2022-11-23T02:38:50.7686348Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69271 2022-11-23T02:38:50.7686942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7687396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7687985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7688444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7689029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7689483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7690063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7690516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7690985Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc8ni8kc4 2022-11-23T02:38:50.7691529Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc8ni8kc4/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7692066Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxzwbqd_9 2022-11-23T02:38:50.7692709Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxzwbqd_9/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7693224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7693700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7694177Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7694671Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7695340Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7696031Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7696951Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7697673Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7698587Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7699316Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7700167Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7700866Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7701712Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:38:50.7702426Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7703271Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7703966Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7704806Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7705520Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7706369Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7707085Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7707908Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7708620Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7709013Z ok (3.974s) 2022-11-23T02:38:50.7709164Z 2022-11-23T02:38:50.7709418Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7709752Z Ran 1 test in 3.974s 2022-11-23T02:38:50.7709918Z 2022-11-23T02:38:50.7710013Z OK 2022-11-23T02:38:50.7710149Z 
2022-11-23T02:38:50.7710274Z Generating XML reports... 2022-11-23T02:38:50.7710824Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022720.xml 2022-11-23T02:38:50.7711508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7711960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7712526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7713001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7713477Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe6uobnkn 2022-11-23T02:38:50.7714025Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe6uobnkn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7714328Z 2022-11-23T02:38:50.7714421Z Running tests... 2022-11-23T02:38:50.7714891Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7715675Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7716175Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7716638Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69379 2022-11-23T02:38:50.7717092Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69380 2022-11-23T02:38:50.7717707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7718153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7718732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7719204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7719789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7720216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7720793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7721265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7721713Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6y0oih2n 2022-11-23T02:38:50.7722256Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6y0oih2n/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7722774Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7723282Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpplb087yh 2022-11-23T02:38:50.7723811Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpplb087yh/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7724322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7724811Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7725311Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7725961Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7726756Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7727688Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7728419Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7729258Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7729966Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7730811Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7731529Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7732450Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:38:50.7733170Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7734011Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7734721Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7735567Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7736263Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7737105Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7737808Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7738645Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7739336Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7739662Z ok (4.828s) 2022-11-23T02:38:50.7739813Z 2022-11-23T02:38:50.7740082Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7740418Z Ran 1 test in 4.828s 2022-11-23T02:38:50.7740566Z 2022-11-23T02:38:50.7740661Z OK 2022-11-23T02:38:50.7740796Z 
2022-11-23T02:38:50.7740923Z Generating XML reports... 2022-11-23T02:38:50.7741487Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022726.xml 2022-11-23T02:38:50.7742154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7742608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7743193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7743739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7744191Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp73e7xhka 2022-11-23T02:38:50.7744737Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp73e7xhka/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7745039Z 2022-11-23T02:38:50.7745149Z Running tests... 2022-11-23T02:38:50.7745544Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7746079Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7746572Z test_nested_comm_tensor_wrapping (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7747049Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69490 2022-11-23T02:38:50.7747485Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69491 2022-11-23T02:38:50.7748102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7748562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7749177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7749662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7750245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7750695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7751254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7751716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7752195Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppazrjae0 2022-11-23T02:38:50.7752785Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppazrjae0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7753285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7753792Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph4gxrjdo 2022-11-23T02:38:50.7754334Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph4gxrjdo/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7754825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7755537Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7756033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7756711Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7757385Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7758320Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7759044Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7759892Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7760603Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7761529Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7762248Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7763093Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:38:50.7763801Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7764111Z ok (3.957s) 2022-11-23T02:38:50.7764261Z 2022-11-23T02:38:50.7764533Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7764870Z Ran 1 test in 3.957s 2022-11-23T02:38:50.7765033Z 2022-11-23T02:38:50.7765111Z OK 2022-11-23T02:38:50.7765246Z 2022-11-23T02:38:50.7765371Z Generating XML reports... 2022-11-23T02:38:50.7765934Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022733.xml 2022-11-23T02:38:50.7766678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7767130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7767743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7768220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7768670Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5x9t6n19 2022-11-23T02:38:50.7769215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5x9t6n19/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7769521Z 2022-11-23T02:38:50.7769631Z Running tests... 2022-11-23T02:38:50.7770042Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7770558Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7771042Z test_scatter_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7771512Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69599 2022-11-23T02:38:50.7771947Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69600 2022-11-23T02:38:50.7772587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7773046Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7773626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7774088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7774675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7775122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7775706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7776163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7776629Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy_sr1bgd 2022-11-23T02:38:50.7777173Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy_sr1bgd/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7777670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7778282Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk67mx7bx 2022-11-23T02:38:50.7778822Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk67mx7bx/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7779327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7779805Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7780302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7780971Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7781666Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7782577Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7783310Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7784211Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7784941Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7785257Z ok (3.925s) 2022-11-23T02:38:50.7785408Z 2022-11-23T02:38:50.7785678Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7786038Z Ran 1 test in 3.925s 2022-11-23T02:38:50.7786200Z 2022-11-23T02:38:50.7786276Z OK 2022-11-23T02:38:50.7786411Z 2022-11-23T02:38:50.7786543Z Generating XML reports... 
2022-11-23T02:38:50.7787112Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022740.xml 2022-11-23T02:38:50.7787795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7788228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7788811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7789283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7789735Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiswf6_e_ 2022-11-23T02:38:50.7790275Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiswf6_e_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7790576Z 2022-11-23T02:38:50.7790687Z Running tests... 2022-11-23T02:38:50.7791099Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7791619Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7792103Z test_scatter_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7792574Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69708 2022-11-23T02:38:50.7793032Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69709 2022-11-23T02:38:50.7793633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7794084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7794666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7795332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7796025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7796478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7797056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7797509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7797978Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnrj0ur5f 2022-11-23T02:38:50.7798528Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnrj0ur5f/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7799027Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7799532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu8rqb8cc 2022-11-23T02:38:50.7800078Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu8rqb8cc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7800590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7801060Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.7801621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.7802297Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7802990Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.7803906Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7804636Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7805488Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:38:50.7806208Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:38:50.7806522Z ok (4.824s) 2022-11-23T02:38:50.7806675Z 2022-11-23T02:38:50.7806947Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7807283Z Ran 1 test in 4.824s 2022-11-23T02:38:50.7807445Z 2022-11-23T02:38:50.7807540Z OK 2022-11-23T02:38:50.7807658Z 2022-11-23T02:38:50.7807784Z Generating XML reports... 
2022-11-23T02:38:50.7808345Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022746.xml 2022-11-23T02:38:50.7809033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7809473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7810054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7810530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7811003Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsk7mh49r 2022-11-23T02:38:50.7811257Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsk7mh49r/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7811278Z 2022-11-23T02:38:50.7811389Z Running tests... 2022-11-23T02:38:50.7811662Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7811979Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7812277Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7812650Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7812872Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69819 2022-11-23T02:38:50.7813090Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69820 2022-11-23T02:38:50.7813452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7813633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7814019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7814215Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7814593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7814772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7815154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7815401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7815691Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps58i13jp 2022-11-23T02:38:50.7815946Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps58i13jp/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7816177Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7816434Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4mmyx6mf 2022-11-23T02:38:50.7816708Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4mmyx6mf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7816946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7817050Z ok 
(5.388s) 2022-11-23T02:38:50.7817070Z 2022-11-23T02:38:50.7817391Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7817506Z Ran 1 test in 5.388s 2022-11-23T02:38:50.7817526Z 2022-11-23T02:38:50.7817607Z OK 2022-11-23T02:38:50.7817645Z 2022-11-23T02:38:50.7817754Z Generating XML reports... 2022-11-23T02:38:50.7818222Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022753.xml 2022-11-23T02:38:50.7818599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7818777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7819162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7819364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7819623Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcwpc10cd 2022-11-23T02:38:50.7819898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcwpc10cd/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7819923Z 2022-11-23T02:38:50.7820016Z Running tests... 2022-11-23T02:38:50.7820286Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7820600Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7820844Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7821114Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7821334Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69934 2022-11-23T02:38:50.7821614Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69935 2022-11-23T02:38:50.7821995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7822154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7822545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7822739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7823114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7823289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7823676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7823873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7824132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa95nlet9 2022-11-23T02:38:50.7824403Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa95nlet9/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7824680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7824947Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8zx5r4rm 2022-11-23T02:38:50.7825239Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8zx5r4rm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7825471Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7825575Z ok 
(5.334s) 2022-11-23T02:38:50.7825596Z 2022-11-23T02:38:50.7825868Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7825989Z Ran 1 test in 5.335s 2022-11-23T02:38:50.7826009Z 2022-11-23T02:38:50.7826103Z OK 2022-11-23T02:38:50.7826123Z 2022-11-23T02:38:50.7826231Z Generating XML reports... 2022-11-23T02:38:50.7826698Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022801.xml 2022-11-23T02:38:50.7827079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7827258Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7827644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7827838Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7828096Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprw8n304x 2022-11-23T02:38:50.7828366Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprw8n304x/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7828390Z 2022-11-23T02:38:50.7828503Z Running tests... 2022-11-23T02:38:50.7828755Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7829075Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7829324Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7829582Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7829806Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70049 2022-11-23T02:38:50.7830021Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70050 2022-11-23T02:38:50.7830400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7830644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7831018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7831213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7831590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7831783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7832180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7832375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7832633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp58tzeb6q 2022-11-23T02:38:50.7832904Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp58tzeb6q/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7833163Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk9_gudmv 2022-11-23T02:38:50.7833410Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk9_gudmv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7833642Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7833922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7834171Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7834409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7834637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7834864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7836006Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.7836131Z warnings.warn( 2022-11-23T02:38:50.7837080Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.7837180Z warnings.warn( 2022-11-23T02:38:50.7837417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7837652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7837883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7838113Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7838344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:38:50.7838572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7838797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7839004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7839108Z ok (5.458s) 2022-11-23T02:38:50.7839130Z 2022-11-23T02:38:50.7839400Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7839514Z Ran 1 test in 5.458s 2022-11-23T02:38:50.7839611Z 2022-11-23T02:38:50.7839716Z OK 2022-11-23T02:38:50.7839736Z 2022-11-23T02:38:50.7839861Z Generating XML reports... 2022-11-23T02:38:50.7840335Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022809.xml 2022-11-23T02:38:50.7840715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7840894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7841263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7841463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7841721Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv_0843bx 2022-11-23T02:38:50.7841990Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv_0843bx/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7842014Z 2022-11-23T02:38:50.7842129Z Running tests... 
2022-11-23T02:38:50.7842400Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7842715Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7842956Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7843257Z DDP works as expected when layer is checkpointed only once. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7843492Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70164 2022-11-23T02:38:50.7843704Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70165 2022-11-23T02:38:50.7844084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7844262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7844655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7844867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7845242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7845425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7845793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7845988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7846249Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4co8frhr 2022-11-23T02:38:50.7846523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4co8frhr/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7846775Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmpt_luso3i 2022-11-23T02:38:50.7847044Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt_luso3i/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7847276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7847507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7847729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7847964Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7848195Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7848427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7849344Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.7849520Z warnings.warn( 2022-11-23T02:38:50.7850438Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.7850553Z warnings.warn( 2022-11-23T02:38:50.7850793Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:38:50.7851031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7851265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7851479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7851704Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7851978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7852212Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7852474Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7852581Z ok (5.565s) 2022-11-23T02:38:50.7852602Z 2022-11-23T02:38:50.7852876Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7852990Z Ran 1 test in 5.565s 2022-11-23T02:38:50.7853009Z 2022-11-23T02:38:50.7853085Z OK 2022-11-23T02:38:50.7853108Z 2022-11-23T02:38:50.7853237Z Generating XML reports... 
2022-11-23T02:38:50.7853704Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022816.xml 2022-11-23T02:38:50.7854083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7854267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7854654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7854852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7855111Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4h5rzhkc 2022-11-23T02:38:50.7855366Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4h5rzhkc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7855404Z 2022-11-23T02:38:50.7855502Z Running tests... 2022-11-23T02:38:50.7855770Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7856084Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7856355Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7856714Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7856937Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70279 2022-11-23T02:38:50.7857152Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70280 2022-11-23T02:38:50.7857530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7857691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7858075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7858327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7858701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7858877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7859265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7859459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7859719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz3wuen0t 2022-11-23T02:38:50.7859976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz3wuen0t/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7860232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp48sfetz3 2022-11-23T02:38:50.7860500Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp48sfetz3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7860732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7860962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7861247Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7861493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7861723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7861954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7862041Z ok (5.406s) 2022-11-23T02:38:50.7862061Z 2022-11-23T02:38:50.7862332Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7862450Z Ran 1 test in 5.406s 2022-11-23T02:38:50.7862470Z 2022-11-23T02:38:50.7862564Z OK 2022-11-23T02:38:50.7862583Z 2022-11-23T02:38:50.7862709Z Generating XML reports... 2022-11-23T02:38:50.7863178Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022824.xml 2022-11-23T02:38:50.7863559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7863739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7864107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7864303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7864560Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp48kbrmm8 2022-11-23T02:38:50.7864832Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp48kbrmm8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7864856Z 2022-11-23T02:38:50.7864965Z Running tests... 
2022-11-23T02:38:50.7865236Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7865554Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7865824Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7866179Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7866385Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70394 2022-11-23T02:38:50.7866601Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70395 2022-11-23T02:38:50.7866980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7867218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7867607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7867803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7868180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7868356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7868723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7868919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7869177Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6x1jq7t5 2022-11-23T02:38:50.7869449Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6x1jq7t5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7869707Z INFO:torch.distributed.nn.jit.instantiator:Created a 
temporary directory at /tmp/tmpoz74jm2m 2022-11-23T02:38:50.7869972Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoz74jm2m/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7870204Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7870479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7870729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7870948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7871178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7871407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7871516Z ok (5.461s) 2022-11-23T02:38:50.7871537Z 2022-11-23T02:38:50.7871810Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7871924Z Ran 1 test in 5.461s 2022-11-23T02:38:50.7871944Z 2022-11-23T02:38:50.7872037Z OK 2022-11-23T02:38:50.7872056Z 2022-11-23T02:38:50.7872183Z Generating XML reports... 
2022-11-23T02:38:50.7872636Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022832.xml 2022-11-23T02:38:50.7873020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7873196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7873581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7873778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7874037Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyq58nj4j 2022-11-23T02:38:50.7874316Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyq58nj4j/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7874336Z 2022-11-23T02:38:50.7874445Z Running tests... 2022-11-23T02:38:50.7874713Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7875221Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7875490Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7875874Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7876097Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70509 2022-11-23T02:38:50.7876311Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70510 2022-11-23T02:38:50.7876781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7876958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7877340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7877521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7877896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7878073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7878457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7878650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7878911Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmhgp9dps 2022-11-23T02:38:50.7879193Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmhgp9dps/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7879426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7879683Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphtv1xla8 2022-11-23T02:38:50.7879997Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphtv1xla8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7880243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7880480Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7880718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7881499Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:38:50.7882270Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:38:50.7882512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7882744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7882847Z ok (5.543s) 2022-11-23T02:38:50.7882867Z 2022-11-23T02:38:50.7883141Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7883255Z Ran 1 test in 5.543s 2022-11-23T02:38:50.7883278Z 2022-11-23T02:38:50.7883373Z OK 2022-11-23T02:38:50.7883393Z 2022-11-23T02:38:50.7883500Z Generating XML reports... 
2022-11-23T02:38:50.7883971Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022840.xml 2022-11-23T02:38:50.7884350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7884529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7884914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7885170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7885430Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg4tq50rs 2022-11-23T02:38:50.7885708Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg4tq50rs/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7885728Z 2022-11-23T02:38:50.7885837Z Running tests... 2022-11-23T02:38:50.7886089Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7886406Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7886651Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7887030Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7887256Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70624 2022-11-23T02:38:50.7887471Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70625 2022-11-23T02:38:50.7887849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7888029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7888458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7888663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7889038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7889215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7889596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7889795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7890053Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgmv1kr3u 2022-11-23T02:38:50.7890327Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgmv1kr3u/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7890544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7890801Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmbfl9tkx 2022-11-23T02:38:50.7891071Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmbfl9tkx/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7891297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7891404Z ok 
(5.363s) 2022-11-23T02:38:50.7891425Z 2022-11-23T02:38:50.7891701Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7891821Z Ran 1 test in 5.363s 2022-11-23T02:38:50.7891840Z 2022-11-23T02:38:50.7891934Z OK 2022-11-23T02:38:50.7891953Z 2022-11-23T02:38:50.7892079Z Generating XML reports... 2022-11-23T02:38:50.7892550Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022848.xml 2022-11-23T02:38:50.7892929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7893089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7893473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7893664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7893921Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6p8fft0h 2022-11-23T02:38:50.7894257Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6p8fft0h/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7894278Z 2022-11-23T02:38:50.7894388Z Running tests... 2022-11-23T02:38:50.7894659Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7894974Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7895202Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7895478Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7895699Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70739 2022-11-23T02:38:50.7895913Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70740 2022-11-23T02:38:50.7896293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7896477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7896862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7897058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7897477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7897645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7898029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7898222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7898480Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp249kg3nk 2022-11-23T02:38:50.7898752Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp249kg3nk/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7898989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7899246Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpucyvyafs 2022-11-23T02:38:50.7899515Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpucyvyafs/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7899749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7899971Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7900211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7900445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7900675Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7900785Z ok (5.356s) 2022-11-23T02:38:50.7900806Z 2022-11-23T02:38:50.7901078Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7901197Z Ran 1 test in 5.356s 2022-11-23T02:38:50.7901216Z 2022-11-23T02:38:50.7901313Z OK 2022-11-23T02:38:50.7901332Z 2022-11-23T02:38:50.7901440Z Generating XML reports... 2022-11-23T02:38:50.7901912Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022855.xml 2022-11-23T02:38:50.7902287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7902465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7902849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7903044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7903364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4hw4m83a 2022-11-23T02:38:50.7903635Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4hw4m83a/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7903656Z 2022-11-23T02:38:50.7903767Z Running tests... 
2022-11-23T02:38:50.7904017Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7904337Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7904599Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7904874Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7905095Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70854 2022-11-23T02:38:50.7905309Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70855 2022-11-23T02:38:50.7905691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7905872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7906237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7906481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7906864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7907041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7907423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7907618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7907876Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4w7laqrz 2022-11-23T02:38:50.7908153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4w7laqrz/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7908409Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpft2kkc0c 2022-11-23T02:38:50.7908661Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpft2kkc0c/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7908896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7909127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7909909Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:38:50.7910680Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:38:50.7911596Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:38:50.7911767Z warnings.warn( 2022-11-23T02:38:50.7912675Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.7912792Z warnings.warn( 2022-11-23T02:38:50.7913033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7913273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7913509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7913743Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7913828Z ok (5.493s) 2022-11-23T02:38:50.7913868Z 2022-11-23T02:38:50.7914122Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7914238Z Ran 1 test in 5.494s 2022-11-23T02:38:50.7914259Z 2022-11-23T02:38:50.7914354Z OK 2022-11-23T02:38:50.7914418Z 2022-11-23T02:38:50.7914552Z Generating XML reports... 
2022-11-23T02:38:50.7915220Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022903.xml 2022-11-23T02:38:50.7915617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7915798Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7916185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7916367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7916629Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa6gsy8ts 2022-11-23T02:38:50.7916901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa6gsy8ts/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7916922Z 2022-11-23T02:38:50.7917037Z Running tests... 2022-11-23T02:38:50.7917308Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7917625Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7917883Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7918160Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7918364Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70969 2022-11-23T02:38:50.7918585Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70970 2022-11-23T02:38:50.7918965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7919144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7919536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7919732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7920109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7920288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7920674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7920936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7921195Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn9d7vzhs 2022-11-23T02:38:50.7921468Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn9d7vzhs/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7921704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7921958Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmposo1q200 2022-11-23T02:38:50.7922227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmposo1q200/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7922458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7923372Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.7923495Z warnings.warn( 2022-11-23T02:38:50.7924457Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.7924560Z warnings.warn( 2022-11-23T02:38:50.7924800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7925039Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7925269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7925502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7925607Z ok (5.373s) 2022-11-23T02:38:50.7925627Z 2022-11-23T02:38:50.7925899Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7926013Z Ran 1 test in 5.373s 2022-11-23T02:38:50.7926036Z 2022-11-23T02:38:50.7926135Z OK 2022-11-23T02:38:50.7926155Z 2022-11-23T02:38:50.7926263Z Generating XML reports... 
2022-11-23T02:38:50.7926729Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022911.xml 2022-11-23T02:38:50.7927109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7927290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7927674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7927876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7928133Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8fl1kglq 2022-11-23T02:38:50.7928406Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8fl1kglq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7928429Z 2022-11-23T02:38:50.7928522Z Running tests... 2022-11-23T02:38:50.7928790Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7929104Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7929367Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7929607Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7929828Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71084 2022-11-23T02:38:50.7930106Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71085 2022-11-23T02:38:50.7930484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7930662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7931035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7931230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7931606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7931784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7932166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7932363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7932621Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg2ea6cfa 2022-11-23T02:38:50.7932898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg2ea6cfa/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7933193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7933459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw0i6sr09 2022-11-23T02:38:50.7933730Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw0i6sr09/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7933959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7934196Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7934432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7934667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7934894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7934981Z ok (5.329s) 2022-11-23T02:38:50.7935020Z 2022-11-23T02:38:50.7935278Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7935393Z Ran 1 test in 5.329s 2022-11-23T02:38:50.7935413Z 2022-11-23T02:38:50.7935514Z OK 2022-11-23T02:38:50.7935534Z 2022-11-23T02:38:50.7935661Z Generating XML reports... 2022-11-23T02:38:50.7936126Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022919.xml 2022-11-23T02:38:50.7936502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7936681Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7937071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7937249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7937508Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzdbq8ttu 2022-11-23T02:38:50.7937787Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzdbq8ttu/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7937807Z 2022-11-23T02:38:50.7937919Z Running tests... 
2022-11-23T02:38:50.7938186Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7938501Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7938764Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7939004Z Test that checkpointing with weight sharing works. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7939288Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71199 2022-11-23T02:38:50.7939485Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71200 2022-11-23T02:38:50.7939865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7940048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7940420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7940597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7940978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7941176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7941560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7941732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7941996Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu6j43lhw 2022-11-23T02:38:50.7942312Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu6j43lhw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7942574Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmpftij08x_ 2022-11-23T02:38:50.7942840Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpftij08x_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7943075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7943303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7943542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7943783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7943997Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7944227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7944457Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7944687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7944924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7945158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.7945263Z ok (5.477s) 2022-11-23T02:38:50.7945284Z 2022-11-23T02:38:50.7945556Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7945657Z Ran 1 test in 5.477s 2022-11-23T02:38:50.7945676Z 2022-11-23T02:38:50.7945771Z OK 2022-11-23T02:38:50.7945791Z 2022-11-23T02:38:50.7945917Z Generating XML reports... 
2022-11-23T02:38:50.7946385Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022926.xml 2022-11-23T02:38:50.7946768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7946947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7947335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7947530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7947772Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq6bkfnr0 2022-11-23T02:38:50.7948110Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq6bkfnr0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7948130Z 2022-11-23T02:38:50.7948240Z Running tests... 2022-11-23T02:38:50.7948508Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7948820Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7949051Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7949321Z This unit test verifies whether the Future object is passed properly. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7949540Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71314 2022-11-23T02:38:50.7949754Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71315 2022-11-23T02:38:50.7950113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7950295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7950680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7950875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7951302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7951486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7951876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7952070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7952311Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4djl7aby 2022-11-23T02:38:50.7952624Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4djl7aby/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7952889Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpknjbdkux 2022-11-23T02:38:50.7953158Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpknjbdkux/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7953388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7953624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7953730Z ok 
(3.902s) 2022-11-23T02:38:50.7953750Z 2022-11-23T02:38:50.7954024Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7954138Z Ran 1 test in 3.902s 2022-11-23T02:38:50.7954158Z 2022-11-23T02:38:50.7954235Z OK 2022-11-23T02:38:50.7954254Z 2022-11-23T02:38:50.7954380Z Generating XML reports... 2022-11-23T02:38:50.7954847Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022934.xml 2022-11-23T02:38:50.7955434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7955615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7956006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7956205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7956466Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyqp7h94w 2022-11-23T02:38:50.7956722Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyqp7h94w/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7956760Z 2022-11-23T02:38:50.7956854Z Running tests... 2022-11-23T02:38:50.7957124Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7957441Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7957758Z test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7958055Z This unit test verifies whether the Future object is passed properly using gloo backend. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7958275Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71427 2022-11-23T02:38:50.7958495Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71428 2022-11-23T02:38:50.7958872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7959032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7959418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7959614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7959989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7960169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7960554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7960802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7961071Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptxhr0s5z 2022-11-23T02:38:50.7961329Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptxhr0s5z/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7961582Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3u9meb78 2022-11-23T02:38:50.7961845Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3u9meb78/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7962082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7962309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7962413Z ok 
(4.965s) 2022-11-23T02:38:50.7962433Z 2022-11-23T02:38:50.7962705Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7962821Z Ran 1 test in 4.966s 2022-11-23T02:38:50.7962845Z 2022-11-23T02:38:50.7962941Z OK 2022-11-23T02:38:50.7962960Z 2022-11-23T02:38:50.7963068Z Generating XML reports... 2022-11-23T02:38:50.7963538Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022940.xml 2022-11-23T02:38:50.7963915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7964092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7964477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7964676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7964936Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwjoihqk5 2022-11-23T02:38:50.7965213Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwjoihqk5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7965234Z 2022-11-23T02:38:50.7965343Z Running tests... 2022-11-23T02:38:50.7965594Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7965907Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7966133Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7966416Z DDP communication hook can only be registered once. This test validates whether ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7966692Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71542 2022-11-23T02:38:50.7966907Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71543 2022-11-23T02:38:50.7967283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7967463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7967835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7968030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7968404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7968584Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7968964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7969162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7969423Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqxrx5fin 2022-11-23T02:38:50.7969699Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqxrx5fin/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7969956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7970223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprypti0aa 2022-11-23T02:38:50.7970494Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprypti0aa/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7970724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7970828Z ok 
(4.049s) 2022-11-23T02:38:50.7970849Z 2022-11-23T02:38:50.7971127Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7971246Z Ran 1 test in 4.049s 2022-11-23T02:38:50.7971266Z 2022-11-23T02:38:50.7971361Z OK 2022-11-23T02:38:50.7971380Z 2022-11-23T02:38:50.7971508Z Generating XML reports... 2022-11-23T02:38:50.7971958Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022948.xml 2022-11-23T02:38:50.7972337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7972515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7972902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7973096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7973353Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprm076n7a 2022-11-23T02:38:50.7973629Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprm076n7a/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7973649Z 2022-11-23T02:38:50.7973760Z Running tests... 2022-11-23T02:38:50.7974013Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7974379Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7974611Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7974892Z Runs "test_sparse_gradients" unit test with DDP communication hook. We define a ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7975115Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71651 2022-11-23T02:38:50.7975328Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71652 2022-11-23T02:38:50.7975707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7975944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7976333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7976507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7976882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7977059Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7977442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7977636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7977894Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvhfpxdre 2022-11-23T02:38:50.7978169Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvhfpxdre/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7978428Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfqkxxr5e 2022-11-23T02:38:50.7978677Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfqkxxr5e/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7978910Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7979183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7979293Z ok 
(3.959s) 2022-11-23T02:38:50.7979314Z 2022-11-23T02:38:50.7979585Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7979701Z Ran 1 test in 3.959s 2022-11-23T02:38:50.7979720Z 2022-11-23T02:38:50.7979816Z OK 2022-11-23T02:38:50.7979835Z 2022-11-23T02:38:50.7979961Z Generating XML reports... 2022-11-23T02:38:50.7980428Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022954.xml 2022-11-23T02:38:50.7980791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7980968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7981356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7981554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7981814Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk0d66j66 2022-11-23T02:38:50.7982086Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk0d66j66/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7982107Z 2022-11-23T02:38:50.7982217Z Running tests... 2022-11-23T02:38:50.7982485Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7982782Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7983004Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7983282Z This unit test makes sure that register_comm_hook properly checks the format ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7983505Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71794 2022-11-23T02:38:50.7983721Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71795 2022-11-23T02:38:50.7984100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7984278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7984664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7984857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7985274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7985450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7985834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7986034Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7986291Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0jxop2e2 2022-11-23T02:38:50.7986569Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0jxop2e2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7986824Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1z6m11jm 2022-11-23T02:38:50.7987088Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1z6m11jm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7987302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7987537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7987643Z ok 
(4.069s) 2022-11-23T02:38:50.7987664Z 2022-11-23T02:38:50.7987938Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7988053Z Ran 1 test in 4.070s 2022-11-23T02:38:50.7988118Z 2022-11-23T02:38:50.7988219Z OK 2022-11-23T02:38:50.7988238Z 2022-11-23T02:38:50.7988365Z Generating XML reports... 2022-11-23T02:38:50.7988835Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023000.xml 2022-11-23T02:38:50.7989213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7989373Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7989757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7989955Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7990212Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvyin_t32 2022-11-23T02:38:50.7990486Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvyin_t32/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7990505Z 2022-11-23T02:38:50.7990617Z Running tests... 2022-11-23T02:38:50.7990887Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7991205Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.7991417Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.7991699Z This test checks whether return annotation checked properly if defined. It also ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.7991923Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71903 2022-11-23T02:38:50.7992137Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71904 2022-11-23T02:38:50.7992517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7992695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7993084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7993279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7993650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7993809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7994190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7994437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7994697Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyee8byel 2022-11-23T02:38:50.7994974Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyee8byel/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7995412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.7995669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6sc_aw4q 2022-11-23T02:38:50.7995939Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6sc_aw4q/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7996153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.7996257Z ok 
(4.054s) 2022-11-23T02:38:50.7996278Z 2022-11-23T02:38:50.7996555Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7996674Z Ran 1 test in 4.054s 2022-11-23T02:38:50.7996694Z 2022-11-23T02:38:50.7996790Z OK 2022-11-23T02:38:50.7996810Z 2022-11-23T02:38:50.7996934Z Generating XML reports... 2022-11-23T02:38:50.7997401Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023007.xml 2022-11-23T02:38:50.7997856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.7998044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.7998416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.7998608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.7998864Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp38mgo3m4 2022-11-23T02:38:50.7999140Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp38mgo3m4/_remote_module_non_scriptable.py 2022-11-23T02:38:50.7999161Z 2022-11-23T02:38:50.7999272Z Running tests... 2022-11-23T02:38:50.7999543Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.7999860Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8000123Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.8000378Z An empty unused_parameters array does not imply find_unused_parameters = ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8000602Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72016 2022-11-23T02:38:50.8000817Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72017 2022-11-23T02:38:50.8001197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8001380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8001771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8001966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8002341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8002518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8002884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8003079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8003339Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpst4r6uvr 2022-11-23T02:38:50.8003615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpst4r6uvr/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8003917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8004174Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx6nz3v6c 2022-11-23T02:38:50.8004451Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx6nz3v6c/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8004683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8005462Z [W 
reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:38:50.8005574Z ok (4.823s) 2022-11-23T02:38:50.8005596Z 2022-11-23T02:38:50.8005853Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8005966Z Ran 1 test in 4.823s 2022-11-23T02:38:50.8005985Z 2022-11-23T02:38:50.8006079Z OK 2022-11-23T02:38:50.8006099Z 2022-11-23T02:38:50.8006272Z Generating XML reports... 2022-11-23T02:38:50.8006749Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023013.xml 2022-11-23T02:38:50.8007124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8007305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8007688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8007889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8008132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp27au27wt 2022-11-23T02:38:50.8008403Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp27au27wt/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8008424Z 2022-11-23T02:38:50.8008534Z Running tests... 
2022-11-23T02:38:50.8008809Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8009127Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8009420Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8009644Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72131 2022-11-23T02:38:50.8009857Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72132 2022-11-23T02:38:50.8010220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8010398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8010784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8010981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8011353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8011530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8011911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8012104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8012363Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoh9p0l2q 2022-11-23T02:38:50.8012675Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoh9p0l2q/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8012928Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0hmg0p7o 2022-11-23T02:38:50.8013203Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0hmg0p7o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8013438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8013671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8013775Z ok (4.856s) 2022-11-23T02:38:50.8013795Z 2022-11-23T02:38:50.8014067Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8014182Z Ran 1 test in 4.856s 2022-11-23T02:38:50.8014202Z 2022-11-23T02:38:50.8014278Z OK 2022-11-23T02:38:50.8014296Z 2022-11-23T02:38:50.8014422Z Generating XML reports... 2022-11-23T02:38:50.8014895Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023020.xml 2022-11-23T02:38:50.8015274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8015455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8015897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8016102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8016361Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb9t59ri5 2022-11-23T02:38:50.8016633Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb9t59ri5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8016654Z 2022-11-23T02:38:50.8016748Z Running tests... 
2022-11-23T02:38:50.8017018Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8017338Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8017654Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8017880Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72246 2022-11-23T02:38:50.8018100Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72247 2022-11-23T02:38:50.8018478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8018656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8019021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8019215Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8019591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8019771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8020154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8020351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8020612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp57e45dhe 2022-11-23T02:38:50.8020888Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp57e45dhe/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8021117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8021356Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnm8zm06u 2022-11-23T02:38:50.8021701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnm8zm06u/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8021931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8022034Z ok (4.872s) 2022-11-23T02:38:50.8022055Z 2022-11-23T02:38:50.8022326Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8022445Z Ran 1 test in 4.872s 2022-11-23T02:38:50.8022465Z 2022-11-23T02:38:50.8022561Z OK 2022-11-23T02:38:50.8022580Z 2022-11-23T02:38:50.8022708Z Generating XML reports... 2022-11-23T02:38:50.8023154Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023028.xml 2022-11-23T02:38:50.8023533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8023711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8024099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8024294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8024551Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmponh7_gb8 2022-11-23T02:38:50.8024868Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmponh7_gb8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8024890Z 2022-11-23T02:38:50.8025006Z Running tests... 
2022-11-23T02:38:50.8025275Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8025574Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8025889Z test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8026112Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72361 2022-11-23T02:38:50.8026334Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72362 2022-11-23T02:38:50.8026712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8026891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8027281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8027476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8027833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8028010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8028394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8028592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8028851Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz4s_wjak 2022-11-23T02:38:50.8029124Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz4s_wjak/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8029359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8029618Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppcgcmc8w 2022-11-23T02:38:50.8029890Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppcgcmc8w/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8030103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8031020Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.8031197Z warnings.warn( 2022-11-23T02:38:50.8032117Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:38:50.8032233Z warnings.warn( 2022-11-23T02:38:50.8032334Z ok (4.870s) 2022-11-23T02:38:50.8032355Z 2022-11-23T02:38:50.8032627Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8032743Z Ran 1 test in 4.870s 2022-11-23T02:38:50.8032763Z 2022-11-23T02:38:50.8032857Z OK 2022-11-23T02:38:50.8032880Z 2022-11-23T02:38:50.8033011Z Generating XML reports... 
2022-11-23T02:38:50.8033457Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023035.xml 2022-11-23T02:38:50.8033834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8034057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8034450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8034646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8034905Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp46fkzg9n 2022-11-23T02:38:50.8035381Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp46fkzg9n/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8035402Z 2022-11-23T02:38:50.8035520Z Running tests... 2022-11-23T02:38:50.8035793Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8036093Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8036407Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8036631Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72476 2022-11-23T02:38:50.8036851Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72477 2022-11-23T02:38:50.8037229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8037407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8037790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8037987Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8038344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8038523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8038914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8039110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8039369Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfkivi0ne 2022-11-23T02:38:50.8039646Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfkivi0ne/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8039880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8040134Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkpqqm81v 2022-11-23T02:38:50.8068659Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkpqqm81v/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8068965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8069215Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8069464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8069554Z ok (5.353s) 2022-11-23T02:38:50.8069596Z 2022-11-23T02:38:50.8069891Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8070048Z Ran 1 test in 5.353s 2022-11-23T02:38:50.8070068Z 2022-11-23T02:38:50.8070165Z OK 2022-11-23T02:38:50.8070185Z 2022-11-23T02:38:50.8070315Z Generating XML reports... 2022-11-23T02:38:50.8070796Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023042.xml 2022-11-23T02:38:50.8071188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8071375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8071769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8072082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8072365Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdzntp_fu 2022-11-23T02:38:50.8072643Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdzntp_fu/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8072664Z 2022-11-23T02:38:50.8072777Z Running tests... 2022-11-23T02:38:50.8073050Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8073371Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8073701Z test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8073928Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72591 2022-11-23T02:38:50.8074154Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72592 2022-11-23T02:38:50.8074520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8074699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8075386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8075593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8075974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8076160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8076545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8076737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8076985Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyc5f19tm 2022-11-23T02:38:50.8077255Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyc5f19tm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8077475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8077729Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7e5aozll 2022-11-23T02:38:50.8077989Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7e5aozll/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8078307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8078535Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8078755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8078842Z ok (5.365s) 2022-11-23T02:38:50.8078870Z 2022-11-23T02:38:50.8079134Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8079237Z Ran 1 test in 5.365s 2022-11-23T02:38:50.8079257Z 2022-11-23T02:38:50.8079341Z OK 2022-11-23T02:38:50.8079360Z 2022-11-23T02:38:50.8079481Z Generating XML reports... 2022-11-23T02:38:50.8079950Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023050.xml 2022-11-23T02:38:50.8080330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8080514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8080899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8081078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8081397Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb0mvsz6g 2022-11-23T02:38:50.8081678Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb0mvsz6g/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8081698Z 2022-11-23T02:38:50.8081810Z Running tests... 2022-11-23T02:38:50.8082084Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8082401Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8082681Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8082908Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72706 2022-11-23T02:38:50.8083128Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72707 2022-11-23T02:38:50.8083492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8083675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8084059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8084254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8084630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8084808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8085185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8085383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8085631Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp7zad0pq 2022-11-23T02:38:50.8085906Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp7zad0pq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8086167Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu5d1h56o 2022-11-23T02:38:50.8086401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8086672Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu5d1h56o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8086903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8087060Z 
skip: Need at least 4 CUDA devices (3.950s) 2022-11-23T02:38:50.8087081Z 2022-11-23T02:38:50.8087412Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8087531Z Ran 1 test in 3.951s 2022-11-23T02:38:50.8087551Z 2022-11-23T02:38:50.8087643Z OK (skipped=1) 2022-11-23T02:38:50.8087662Z 2022-11-23T02:38:50.8087789Z Generating XML reports... 2022-11-23T02:38:50.8088264Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023057.xml 2022-11-23T02:38:50.8088644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8088825Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8089210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8089406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8089663Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptffqy5tf 2022-11-23T02:38:50.8089924Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptffqy5tf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8089962Z 2022-11-23T02:38:50.8090056Z Running tests... 2022-11-23T02:38:50.8090324Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8090691Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8090977Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8091199Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72809 2022-11-23T02:38:50.8091420Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72810 2022-11-23T02:38:50.8091801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8091980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8092353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8092551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8092925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8093102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8093486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8093678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8093939Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphb96msvj 2022-11-23T02:38:50.8094213Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphb96msvj/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8094433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8094693Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcd9cukbr 2022-11-23T02:38:50.8094965Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcd9cukbr/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8095199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8095352Z 
skip: Need at least 8 CUDA devices (4.059s) 2022-11-23T02:38:50.8095373Z 2022-11-23T02:38:50.8095639Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8095754Z Ran 1 test in 4.059s 2022-11-23T02:38:50.8095774Z 2022-11-23T02:38:50.8095885Z OK (skipped=1) 2022-11-23T02:38:50.8095904Z 2022-11-23T02:38:50.8096030Z Generating XML reports... 2022-11-23T02:38:50.8096480Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023104.xml 2022-11-23T02:38:50.8096919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8097096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8097486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8097681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8097940Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy561re9w 2022-11-23T02:38:50.8098210Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy561re9w/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8098229Z 2022-11-23T02:38:50.8098341Z Running tests... 2022-11-23T02:38:50.8098588Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8098905Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8099189Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8099412Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72912 2022-11-23T02:38:50.8099630Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72913 2022-11-23T02:38:50.8100054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8100239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8100627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8100822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8101172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8101354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8101734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8101929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8102194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe2alicqe 2022-11-23T02:38:50.8102471Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe2alicqe/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8102727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb__7a28s 2022-11-23T02:38:50.8102995Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb__7a28s/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8103210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8103441Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8103681Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8103922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8104025Z ok (4.017s) 2022-11-23T02:38:50.8104045Z 2022-11-23T02:38:50.8104320Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8104438Z Ran 1 test in 4.018s 2022-11-23T02:38:50.8104457Z 2022-11-23T02:38:50.8104553Z OK 2022-11-23T02:38:50.8104572Z 2022-11-23T02:38:50.8104697Z Generating XML reports... 2022-11-23T02:38:50.8105148Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023110.xml 2022-11-23T02:38:50.8105528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8105814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8106201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8106394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8106653Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpav3znra0 2022-11-23T02:38:50.8106932Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpav3znra0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8106953Z 2022-11-23T02:38:50.8107064Z Running tests... 2022-11-23T02:38:50.8107316Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8107635Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8107932Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8108157Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73025 2022-11-23T02:38:50.8108377Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73026 2022-11-23T02:38:50.8108756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8109002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8109400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8109592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8109945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8110123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8110503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8110706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8110964Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9o5mjmr0 2022-11-23T02:38:50.8111237Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9o5mjmr0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8111475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8111735Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4l7jw_a7 2022-11-23T02:38:50.8111986Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4l7jw_a7/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8112217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8112457Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8112700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8112806Z ok (3.977s) 2022-11-23T02:38:50.8112827Z 2022-11-23T02:38:50.8113101Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8113219Z Ran 1 test in 3.978s 2022-11-23T02:38:50.8113238Z 2022-11-23T02:38:50.8113334Z OK 2022-11-23T02:38:50.8113353Z 2022-11-23T02:38:50.8113485Z Generating XML reports... 2022-11-23T02:38:50.8113934Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023117.xml 2022-11-23T02:38:50.8114312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8114490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8114877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8115363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8115632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpii4xrz6b 2022-11-23T02:38:50.8115909Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpii4xrz6b/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8115930Z 2022-11-23T02:38:50.8116044Z Running tests... 2022-11-23T02:38:50.8116306Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8116624Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8116826Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.8117085Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8117350Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73138 2022-11-23T02:38:50.8117574Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73139 2022-11-23T02:38:50.8117952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8118130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8118600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8118790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8119163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8119339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8119711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8119893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8120158Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnvat7h9c 2022-11-23T02:38:50.8120432Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnvat7h9c/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8120689Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdjihs8yb 2022-11-23T02:38:50.8120949Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdjihs8yb/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8121185Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8121412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8121650Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8121887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8121997Z ok (4.039s) 2022-11-23T02:38:50.8122018Z 2022-11-23T02:38:50.8122288Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8122401Z Ran 1 test in 4.039s 2022-11-23T02:38:50.8122420Z 2022-11-23T02:38:50.8122514Z OK 2022-11-23T02:38:50.8122534Z 2022-11-23T02:38:50.8122642Z Generating XML reports... 2022-11-23T02:38:50.8123109Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023123.xml 2022-11-23T02:38:50.8123489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8123666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8124045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8124238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8124572Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsqk4qyrh 2022-11-23T02:38:50.8124842Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsqk4qyrh/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8124861Z 2022-11-23T02:38:50.8124954Z Running tests... 2022-11-23T02:38:50.8125223Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8125541Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8125771Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-11-23T02:38:50.8126029Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8126251Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73281 2022-11-23T02:38:50.8126472Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73282 2022-11-23T02:38:50.8126853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8127036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8127405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8127646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8128030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8128206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8128584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8128774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8129035Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzayhs1qg 2022-11-23T02:38:50.8129313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzayhs1qg/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8129529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8129783Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwuki6tr1 2022-11-23T02:38:50.8130055Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwuki6tr1/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8130286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8130386Z ok 
(4.033s) 2022-11-23T02:38:50.8130406Z 2022-11-23T02:38:50.8130672Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8130784Z Ran 1 test in 4.033s 2022-11-23T02:38:50.8130804Z 2022-11-23T02:38:50.8130898Z OK 2022-11-23T02:38:50.8130917Z 2022-11-23T02:38:50.8131042Z Generating XML reports... 2022-11-23T02:38:50.8131498Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023129.xml 2022-11-23T02:38:50.8131870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8132038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8132418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8132613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8132866Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplzeym2ub 2022-11-23T02:38:50.8133141Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplzeym2ub/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8133162Z 2022-11-23T02:38:50.8133268Z Running tests... 2022-11-23T02:38:50.8133519Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8133886Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8134161Z test_ignored_sharded_tensor (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8134374Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73424 2022-11-23T02:38:50.8134596Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73425 2022-11-23T02:38:50.8134974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8135153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8135537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8135729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8136090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8136264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8137591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8137848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8138118Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwojh30zt 2022-11-23T02:38:50.8138392Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwojh30zt/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8138647Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjqbv8h5c 2022-11-23T02:38:50.8138917Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjqbv8h5c/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8139140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8139357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8139603Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.8139843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8140250Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.8140640Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.8140734Z ok (4.975s) 2022-11-23T02:38:50.8140754Z 2022-11-23T02:38:50.8141009Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8141114Z Ran 1 test in 4.975s 2022-11-23T02:38:50.8141134Z 2022-11-23T02:38:50.8141214Z OK 2022-11-23T02:38:50.8141239Z 2022-11-23T02:38:50.8141348Z Generating XML reports... 2022-11-23T02:38:50.8141806Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023136.xml 2022-11-23T02:38:50.8142172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8142343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8142721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8142907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8143163Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppi21roo6 2022-11-23T02:38:50.8143427Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppi21roo6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8143500Z 2022-11-23T02:38:50.8143598Z Running tests... 
2022-11-23T02:38:50.8143865Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8144171Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8144442Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8144660Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73535 2022-11-23T02:38:50.8144877Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73536 2022-11-23T02:38:50.8145243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8145410Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8145779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8145967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8146326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8146493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8146916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8147107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8147355Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj6ulfsq7 2022-11-23T02:38:50.8147615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj6ulfsq7/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8147834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8148368Z 
INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8148924Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8149453Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8149987Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8150518Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8151048Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate 
= 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8151351Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmperid93qb 2022-11-23T02:38:50.8151616Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmperid93qb/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8151842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8152373Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8152946Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8153519Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8154054Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; 
batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8154584Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8155331Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:38:50.8155442Z ok (3.920s) 2022-11-23T02:38:50.8155463Z 2022-11-23T02:38:50.8155738Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8155850Z Ran 1 test in 3.920s 2022-11-23T02:38:50.8155871Z 2022-11-23T02:38:50.8155958Z OK 2022-11-23T02:38:50.8155977Z 2022-11-23T02:38:50.8156099Z Generating XML reports... 
2022-11-23T02:38:50.8156567Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023143.xml 2022-11-23T02:38:50.8156933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8157108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8157490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8157684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8157936Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9yhxlur_ 2022-11-23T02:38:50.8158204Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9yhxlur_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8158225Z 2022-11-23T02:38:50.8158328Z Running tests... 2022-11-23T02:38:50.8158591Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8158983Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8159254Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8159471Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73638 2022-11-23T02:38:50.8159688Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73639 2022-11-23T02:38:50.8160062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8160234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8160619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8160804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8161171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8161336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8161709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8161958Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8162217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2df79bwn 2022-11-23T02:38:50.8162481Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2df79bwn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8162731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw7qa21hl 2022-11-23T02:38:50.8162955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8163222Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw7qa21hl/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8163440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8163681Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.8163921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8164328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.8164722Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:38:50.8164953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8165186Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8165413Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8165643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8165729Z ok (5.374s) 2022-11-23T02:38:50.8165750Z 2022-11-23T02:38:50.8166011Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8166122Z Ran 1 test in 5.374s 2022-11-23T02:38:50.8166141Z 2022-11-23T02:38:50.8166230Z OK 2022-11-23T02:38:50.8166250Z 2022-11-23T02:38:50.8166372Z Generating XML reports... 
2022-11-23T02:38:50.8166826Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023149.xml 2022-11-23T02:38:50.8167195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8167369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8167747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8167989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8168245Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbe9v7vmq 2022-11-23T02:38:50.8168511Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbe9v7vmq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8168535Z 2022-11-23T02:38:50.8168642Z Running tests... 2022-11-23T02:38:50.8168905Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8169217Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8169478Z test_sparse_gradients (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8169700Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73753 2022-11-23T02:38:50.8169903Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73754 2022-11-23T02:38:50.8170281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8170450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8170825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8171071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8171446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8171621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8171990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8172174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8172423Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzwvr746_ 2022-11-23T02:38:50.8172688Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzwvr746_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8172945Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjnlvh4il 2022-11-23T02:38:50.8173214Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjnlvh4il/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8173439Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8173665Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8173759Z ok 
(3.949s) 2022-11-23T02:38:50.8173779Z 2022-11-23T02:38:50.8174039Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8174136Z Ran 1 test in 3.949s 2022-11-23T02:38:50.8174155Z 2022-11-23T02:38:50.8174246Z OK 2022-11-23T02:38:50.8174269Z 2022-11-23T02:38:50.8174389Z Generating XML reports... 2022-11-23T02:38:50.8174850Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023157.xml 2022-11-23T02:38:50.8175225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8175405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8175781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8175966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8176211Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdxkqymo5 2022-11-23T02:38:50.8176466Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdxkqymo5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8176487Z 2022-11-23T02:38:50.8176644Z Running tests... 2022-11-23T02:38:50.8176904Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8177207Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8177487Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8177708Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73896 2022-11-23T02:38:50.8177921Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73897 2022-11-23T02:38:50.8178293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8178454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8178830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8179021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8179385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8179553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8179978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8180171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8180425Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy0s0usxy 2022-11-23T02:38:50.8180689Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy0s0usxy/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8180902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8181158Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl23l7h8s 2022-11-23T02:38:50.8181424Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl23l7h8s/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8181648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8181743Z ok 
(4.088s) 2022-11-23T02:38:50.8181764Z 2022-11-23T02:38:50.8182034Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8182142Z Ran 1 test in 4.088s 2022-11-23T02:38:50.8182162Z 2022-11-23T02:38:50.8182251Z OK 2022-11-23T02:38:50.8182270Z 2022-11-23T02:38:50.8182378Z Generating XML reports... 2022-11-23T02:38:50.8182834Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023203.xml 2022-11-23T02:38:50.8183208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8183379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8183762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8183947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8184200Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3qq9p0mg 2022-11-23T02:38:50.8184463Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3qq9p0mg/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8184483Z 2022-11-23T02:38:50.8184589Z Running tests... 2022-11-23T02:38:50.8184843Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8185152Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8185430Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8185642Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74039 2022-11-23T02:38:50.8185916Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74040 2022-11-23T02:38:50.8186288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8186464Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8186845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8187022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8187390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8187555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8187931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8188119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8188375Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2nzra2sw 2022-11-23T02:38:50.8188642Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2nzra2sw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8188938Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpokyz2z6c 2022-11-23T02:38:50.8189209Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpokyz2z6c/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8189424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8189648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8189885Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8190119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8190219Z ok (5.773s) 2022-11-23T02:38:50.8190239Z 2022-11-23T02:38:50.8190507Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8190615Z Ran 1 test in 5.774s 2022-11-23T02:38:50.8190635Z 2022-11-23T02:38:50.8190725Z OK 2022-11-23T02:38:50.8190744Z 2022-11-23T02:38:50.8190852Z Generating XML reports... 2022-11-23T02:38:50.8191313Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023210.xml 2022-11-23T02:38:50.8191683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8191853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8192232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8192418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8192677Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoj5__dpx 2022-11-23T02:38:50.8192940Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoj5__dpx/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8192960Z 2022-11-23T02:38:50.8193067Z Running tests... 2022-11-23T02:38:50.8193324Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8193630Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8193913Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8194125Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74154 2022-11-23T02:38:50.8194347Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74155 2022-11-23T02:38:50.8194720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8194948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8195562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8195742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8196112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8196282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8196657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8196837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8197084Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp73xmyoa7 2022-11-23T02:38:50.8197352Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp73xmyoa7/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8197584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8197835Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2k2boxg4 2022-11-23T02:38:50.8198163Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2k2boxg4/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8198395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8198623Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8198850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:38:50.8198945Z ok (5.477s) 2022-11-23T02:38:50.8198965Z 2022-11-23T02:38:50.8199231Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8199343Z Ran 1 test in 5.478s 2022-11-23T02:38:50.8199363Z 2022-11-23T02:38:50.8199454Z OK 2022-11-23T02:38:50.8199473Z 2022-11-23T02:38:50.8199581Z Generating XML reports... 2022-11-23T02:38:50.8200039Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023218.xml 2022-11-23T02:38:50.8200418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8200589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8200968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8201154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8201405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9s5wvbdc 2022-11-23T02:38:50.8201676Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9s5wvbdc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8201696Z 2022-11-23T02:38:50.8201805Z Running tests... 2022-11-23T02:38:50.8202056Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8202365Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8202705Z test_allgather_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8202920Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74269 2022-11-23T02:38:50.8203291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8203462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8203841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8204107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8204350Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2bmorpsh 2022-11-23T02:38:50.8204613Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2bmorpsh/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8204837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8205076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8205473Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:38:50.8206218Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8206328Z warnings.warn( 2022-11-23T02:38:50.8206419Z ok (3.875s) 2022-11-23T02:38:50.8206439Z 2022-11-23T02:38:50.8206695Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8206792Z Ran 1 test in 3.875s 2022-11-23T02:38:50.8206818Z 2022-11-23T02:38:50.8206895Z OK 2022-11-23T02:38:50.8206914Z 2022-11-23T02:38:50.8207077Z Generating XML reports... 2022-11-23T02:38:50.8207633Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023226.xml 2022-11-23T02:38:50.8207996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8208165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8208540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8208730Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8208981Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpug42g4lo 2022-11-23T02:38:50.8209235Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpug42g4lo/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8209255Z 2022-11-23T02:38:50.8209368Z Running tests... 2022-11-23T02:38:50.8209630Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8209945Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8210281Z test_allreduce_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8210493Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74341 2022-11-23T02:38:50.8210867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8211040Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8211412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8211596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8211847Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp48ki7j4o 2022-11-23T02:38:50.8212110Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp48ki7j4o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8212330Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8212568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8212963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:38:50.8213760Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8213864Z warnings.warn( 2022-11-23T02:38:50.8213955Z ok (3.818s) 2022-11-23T02:38:50.8213979Z 2022-11-23T02:38:50.8214229Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8214335Z Ran 1 test in 3.818s 2022-11-23T02:38:50.8214354Z 2022-11-23T02:38:50.8214438Z OK 2022-11-23T02:38:50.8214458Z 2022-11-23T02:38:50.8214572Z Generating XML reports... 2022-11-23T02:38:50.8215117Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023232.xml 2022-11-23T02:38:50.8215482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8215656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8216032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8216211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8216512Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqy4g16n2 2022-11-23T02:38:50.8216784Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqy4g16n2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8216804Z 2022-11-23T02:38:50.8216906Z Running tests... 2022-11-23T02:38:50.8217172Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8217486Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8217806Z test_collectives (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8218028Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74413 2022-11-23T02:38:50.8218397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8218559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8218943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8219127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8219379Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsjanl5zu 2022-11-23T02:38:50.8219641Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsjanl5zu/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8219868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8220110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8220516Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:38:50.8220603Z ok (3.849s) 2022-11-23T02:38:50.8220636Z 2022-11-23T02:38:50.8220890Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8220995Z Ran 1 test in 3.849s 2022-11-23T02:38:50.8221015Z 2022-11-23T02:38:50.8221101Z OK 2022-11-23T02:38:50.8221121Z 2022-11-23T02:38:50.8221247Z Generating XML reports... 
2022-11-23T02:38:50.8221799Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023238.xml 2022-11-23T02:38:50.8222168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8222342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8222779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8222957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8223211Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxrd1n38v 2022-11-23T02:38:50.8223476Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxrd1n38v/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8223497Z 2022-11-23T02:38:50.8223604Z Running tests... 2022-11-23T02:38:50.8223863Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8224173Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8224498Z test_monitored_barrier (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8224721Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74485 2022-11-23T02:38:50.8225094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8225258Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8225682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8225883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8226140Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpihvp6qek 2022-11-23T02:38:50.8226414Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpihvp6qek/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8226646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8226893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8227308Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:38:50.8227395Z ok (3.925s) 2022-11-23T02:38:50.8227416Z 2022-11-23T02:38:50.8227681Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8227794Z Ran 1 test in 3.926s 2022-11-23T02:38:50.8227818Z 2022-11-23T02:38:50.8227914Z OK 2022-11-23T02:38:50.8227934Z 2022-11-23T02:38:50.8228059Z Generating XML reports... 
2022-11-23T02:38:50.8228616Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023244.xml 2022-11-23T02:38:50.8228994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8229175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8229563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8229746Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8230007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps2hwowks 2022-11-23T02:38:50.8230284Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps2hwowks/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8230304Z 2022-11-23T02:38:50.8230415Z Running tests... 2022-11-23T02:38:50.8230684Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8230999Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8231254Z test_allgather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8231478Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74557 2022-11-23T02:38:50.8231744Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74558 2022-11-23T02:38:50.8231961Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 74559 2022-11-23T02:38:50.8232177Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 74560 2022-11-23T02:38:50.8232561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8232741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8233127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8233323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8233695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8233874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8234240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8234435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8234803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8235239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8235649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8235839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8236211Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8236388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8236755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8236956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8237219Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzerv8r7b 2022-11-23T02:38:50.8237499Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzerv8r7b/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8237732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8237991Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps7xkrpco 2022-11-23T02:38:50.8238267Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps7xkrpco/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8238523Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfjk21u4x 2022-11-23T02:38:50.8238791Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfjk21u4x/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8239005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8239238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8239493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcrmncx68 2022-11-23T02:38:50.8239765Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcrmncx68/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8239994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8240100Z ok (4.165s) 
2022-11-23T02:38:50.8240121Z 2022-11-23T02:38:50.8240396Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8240513Z Ran 1 test in 4.165s 2022-11-23T02:38:50.8240532Z 2022-11-23T02:38:50.8240609Z OK 2022-11-23T02:38:50.8240627Z 2022-11-23T02:38:50.8240752Z Generating XML reports... 2022-11-23T02:38:50.8241293Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023250.xml 2022-11-23T02:38:50.8241670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8241849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8242239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8242436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8242695Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpig7lksus 2022-11-23T02:38:50.8242967Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpig7lksus/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8242988Z 2022-11-23T02:38:50.8243082Z Running tests... 2022-11-23T02:38:50.8243352Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8243673Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8243936Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8244157Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74740 2022-11-23T02:38:50.8244486Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74741 2022-11-23T02:38:50.8244714Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 74742 2022-11-23T02:38:50.8244928Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 74743 2022-11-23T02:38:50.8245290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8245465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8245848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8246046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8246419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8246600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8246979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8247170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8247546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8247704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8248080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8248274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8248646Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8248823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8249200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8249391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8249650Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyu4_2une 2022-11-23T02:38:50.8249907Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyu4_2une/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8250139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8250453Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpes00lcpv 2022-11-23T02:38:50.8250726Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpes00lcpv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8250957Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8251218Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp76aqkqv6 2022-11-23T02:38:50.8251487Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp76aqkqv6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8251745Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0opzdkxc 2022-11-23T02:38:50.8252016Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0opzdkxc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8252230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8252462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8252612Z ok (5.257s) 
2022-11-23T02:38:50.8252633Z 2022-11-23T02:38:50.8252906Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8253020Z Ran 1 test in 5.258s 2022-11-23T02:38:50.8253040Z 2022-11-23T02:38:50.8253135Z OK 2022-11-23T02:38:50.8253154Z 2022-11-23T02:38:50.8253328Z Generating XML reports... 2022-11-23T02:38:50.8253773Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023257.xml 2022-11-23T02:38:50.8254133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8254312Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8254695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8254893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8255150Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpurcr1cbb 2022-11-23T02:38:50.8255423Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpurcr1cbb/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8255443Z 2022-11-23T02:38:50.8255554Z Running tests... 2022-11-23T02:38:50.8255825Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8256140Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8256376Z test_allgather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8256596Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74927 2022-11-23T02:38:50.8256817Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74928 2022-11-23T02:38:50.8257034Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 74929 2022-11-23T02:38:50.8257252Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 74930 2022-11-23T02:38:50.8257629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8257807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8258193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8258371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8258738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8258914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8259301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8259564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8259935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8260111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8260492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8260666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8261037Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8261210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8261589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8261780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8262048Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk5hai5v2 2022-11-23T02:38:50.8262321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk5hai5v2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8262624Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7tsypeok 2022-11-23T02:38:50.8262902Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7tsypeok/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8263116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8263340Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8263599Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_do_spl1 2022-11-23T02:38:50.8263870Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_do_spl1/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8264108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8264366Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnxoitin9 2022-11-23T02:38:50.8264639Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnxoitin9/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8264870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8264958Z ok (4.121s) 
2022-11-23T02:38:50.8264996Z 2022-11-23T02:38:50.8265252Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8265365Z Ran 1 test in 4.121s 2022-11-23T02:38:50.8265385Z 2022-11-23T02:38:50.8265479Z OK 2022-11-23T02:38:50.8265499Z 2022-11-23T02:38:50.8265623Z Generating XML reports... 2022-11-23T02:38:50.8266058Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023305.xml 2022-11-23T02:38:50.8266438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8266620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8267009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8267188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8267449Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpameagq9c 2022-11-23T02:38:50.8267721Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpameagq9c/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8267741Z 2022-11-23T02:38:50.8267852Z Running tests... 2022-11-23T02:38:50.8268119Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8268437Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8268762Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8268986Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75110 2022-11-23T02:38:50.8269187Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75111 2022-11-23T02:38:50.8269410Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75112 2022-11-23T02:38:50.8269630Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75113 2022-11-23T02:38:50.8270010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8270190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8270574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8270772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8271143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8271322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8271724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8271928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8272296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8272472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8272846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8273036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8273418Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8273596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8273953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8274148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8274408Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsr3mus0k 2022-11-23T02:38:50.8274679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsr3mus0k/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8274937Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpye66qe7w 2022-11-23T02:38:50.8275395Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpye66qe7w/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8275636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8275870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8276127Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnw44331n 2022-11-23T02:38:50.8276380Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnw44331n/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8276612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8276870Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx786ohr9 2022-11-23T02:38:50.8277141Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx786ohr9/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8277371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8277619Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8277954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.8278201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:38:50.8278623Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8278850Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:38:50.8279245Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8279644Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8280038Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8280790Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8280906Z warnings.warn( 2022-11-23T02:38:50.8281718Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8281840Z warnings.warn( 2022-11-23T02:38:50.8282580Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8282699Z warnings.warn( 2022-11-23T02:38:50.8283441Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8283543Z warnings.warn( 2022-11-23T02:38:50.8283646Z ok (4.165s) 2022-11-23T02:38:50.8283667Z 2022-11-23T02:38:50.8283938Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8284050Z Ran 1 test in 4.165s 2022-11-23T02:38:50.8284070Z 2022-11-23T02:38:50.8284164Z OK 2022-11-23T02:38:50.8284183Z 2022-11-23T02:38:50.8284311Z Generating XML reports... 
2022-11-23T02:38:50.8284750Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023311.xml 2022-11-23T02:38:50.8285133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8285297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8285681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8285881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8286142Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0d7pn15a 2022-11-23T02:38:50.8286416Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0d7pn15a/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8286436Z 2022-11-23T02:38:50.8286549Z Running tests... 2022-11-23T02:38:50.8286818Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8287135Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8287463Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8287670Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75293 2022-11-23T02:38:50.8287891Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75294 2022-11-23T02:38:50.8288116Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75295 2022-11-23T02:38:50.8288335Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75296 2022-11-23T02:38:50.8288716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8288896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8289283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8289478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8289836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8290014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8290393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8290632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8291009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8291185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8291571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8291763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8292135Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8292293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8292676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8292872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8293135Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm4f4zvos 2022-11-23T02:38:50.8293410Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm4f4zvos/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8293644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8293900Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi2dkhwrf 2022-11-23T02:38:50.8294174Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi2dkhwrf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8294415Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7wk2uctv 2022-11-23T02:38:50.8294682Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7wk2uctv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8294942Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0rdxl1c1 2022-11-23T02:38:50.8295213Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0rdxl1c1/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8295445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8295670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8295898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8296644Z 
/opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8296820Z warnings.warn( 2022-11-23T02:38:50.8297568Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8297665Z warnings.warn( 2022-11-23T02:38:50.8298399Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8298511Z warnings.warn( 2022-11-23T02:38:50.8299245Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8299357Z warnings.warn( 2022-11-23T02:38:50.8299463Z ok (4.121s) 2022-11-23T02:38:50.8299483Z 2022-11-23T02:38:50.8299811Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8299938Z Ran 1 test in 4.122s 2022-11-23T02:38:50.8299957Z 2022-11-23T02:38:50.8300052Z OK 2022-11-23T02:38:50.8300071Z 2022-11-23T02:38:50.8300180Z Generating XML reports... 
2022-11-23T02:38:50.8300622Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023318.xml 2022-11-23T02:38:50.8301003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8301189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8301579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8301776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8302038Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5zbo782m 2022-11-23T02:38:50.8302309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5zbo782m/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8302329Z 2022-11-23T02:38:50.8302439Z Running tests... 2022-11-23T02:38:50.8302688Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8303005Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8303277Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8303503Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75476 2022-11-23T02:38:50.8303724Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75477 2022-11-23T02:38:50.8303942Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75478 2022-11-23T02:38:50.8304160Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75479 2022-11-23T02:38:50.8304543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8304704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8305091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8305286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8305654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8305891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8306269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8306460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8306831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8307009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8307368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8307559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8307929Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8308109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8308492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8308681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8308983Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9vlh32kq 2022-11-23T02:38:50.8309264Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9vlh32kq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8309503Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv0sunj7l 2022-11-23T02:38:50.8309761Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpieg5qjv5 2022-11-23T02:38:50.8310032Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv0sunj7l/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8310296Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpieg5qjv5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8310534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8310770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8310997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8311258Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbr8kd_mw 2022-11-23T02:38:50.8311523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbr8kd_mw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8311734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8311839Z ok (4.163s) 
2022-11-23T02:38:50.8311859Z 2022-11-23T02:38:50.8312133Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8312248Z Ran 1 test in 4.163s 2022-11-23T02:38:50.8312272Z 2022-11-23T02:38:50.8312367Z OK 2022-11-23T02:38:50.8312386Z 2022-11-23T02:38:50.8312513Z Generating XML reports... 2022-11-23T02:38:50.8312952Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023324.xml 2022-11-23T02:38:50.8313329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8313495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8313883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8314076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8314333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7_tl65lw 2022-11-23T02:38:50.8314603Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7_tl65lw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8314674Z 2022-11-23T02:38:50.8314789Z Running tests... 2022-11-23T02:38:50.8315274Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8315602Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8315856Z test_allgather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8316064Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75659 2022-11-23T02:38:50.8316284Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75660 2022-11-23T02:38:50.8316502Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75661 2022-11-23T02:38:50.8316719Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75662 2022-11-23T02:38:50.8317101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8317282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8317658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8317835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8318279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8318486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8318873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8319064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8319435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8319611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8320001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8320197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8320568Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8320734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8321115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8321305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8321565Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfr5w6x66 2022-11-23T02:38:50.8321834Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfr5w6x66/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8322071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8322331Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9zg9tbk6 2022-11-23T02:38:50.8322602Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9zg9tbk6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8322844Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1vnir_hc 2022-11-23T02:38:50.8323112Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1vnir_hc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8323341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8323572Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8323822Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc_khcdls 2022-11-23T02:38:50.8324091Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc_khcdls/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8324393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8324497Z ok (4.656s) 
2022-11-23T02:38:50.8324518Z 2022-11-23T02:38:50.8324791Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8324889Z Ran 1 test in 4.656s 2022-11-23T02:38:50.8324912Z 2022-11-23T02:38:50.8325008Z OK 2022-11-23T02:38:50.8325027Z 2022-11-23T02:38:50.8325152Z Generating XML reports... 2022-11-23T02:38:50.8325591Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023331.xml 2022-11-23T02:38:50.8325967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8326145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8326531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8326729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8326968Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr2e5aikf 2022-11-23T02:38:50.8327237Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr2e5aikf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8327303Z 2022-11-23T02:38:50.8327423Z Running tests... 2022-11-23T02:38:50.8327690Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8328010Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8328269Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8328488Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75866 2022-11-23T02:38:50.8328705Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75867 2022-11-23T02:38:50.8328912Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75868 2022-11-23T02:38:50.8329128Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75869 2022-11-23T02:38:50.8329509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8329694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8330082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8330276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8330648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8330826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8331205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8331383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8331745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8331927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8332306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8332498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8332873Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8333050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8333425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8333662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8333922Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgzpxpcr0 2022-11-23T02:38:50.8334196Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgzpxpcr0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8334459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphnounbps 2022-11-23T02:38:50.8334733Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphnounbps/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8334964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8335197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8335453Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm79e1z70 2022-11-23T02:38:50.8335725Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm79e1z70/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8335937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8336193Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8r41cikw 2022-11-23T02:38:50.8336509Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8r41cikw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8336745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8336849Z ok (6.980s) 
2022-11-23T02:38:50.8336869Z 2022-11-23T02:38:50.8337145Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8337261Z Ran 1 test in 6.981s 2022-11-23T02:38:50.8337280Z 2022-11-23T02:38:50.8337376Z OK 2022-11-23T02:38:50.8337394Z 2022-11-23T02:38:50.8337502Z Generating XML reports... 2022-11-23T02:38:50.8337946Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023338.xml 2022-11-23T02:38:50.8338320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8338501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8338892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8339088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8339342Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjfv_jzoo 2022-11-23T02:38:50.8339612Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjfv_jzoo/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8339632Z 2022-11-23T02:38:50.8339745Z Running tests... 2022-11-23T02:38:50.8339996Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8340312Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8340564Z test_allreduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8340788Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76077 2022-11-23T02:38:50.8341012Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76078 2022-11-23T02:38:50.8341231Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76079 2022-11-23T02:38:50.8341448Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76080 2022-11-23T02:38:50.8341825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8341988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8342375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8342627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8342999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8343175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8343552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8343745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8344112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8344293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8344653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8344846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8345212Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8345388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8345813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8346010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8346271Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyrqvnt5o 2022-11-23T02:38:50.8346546Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyrqvnt5o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8346762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8347021Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp27hc8d1h 2022-11-23T02:38:50.8347298Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp27hc8d1h/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8347552Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd2qw0jjw 2022-11-23T02:38:50.8347826Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd2qw0jjw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8348058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8348284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8348537Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqsl0dua3 2022-11-23T02:38:50.8348807Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqsl0dua3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8349019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8349125Z ok (4.252s) 
2022-11-23T02:38:50.8349146Z 2022-11-23T02:38:50.8349420Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8349538Z Ran 1 test in 4.252s 2022-11-23T02:38:50.8349558Z 2022-11-23T02:38:50.8349652Z OK 2022-11-23T02:38:50.8349672Z 2022-11-23T02:38:50.8349799Z Generating XML reports... 2022-11-23T02:38:50.8350243Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023347.xml 2022-11-23T02:38:50.8350622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8350784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8351174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8351372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8351685Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9g1w00r0 2022-11-23T02:38:50.8351954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9g1w00r0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8351974Z 2022-11-23T02:38:50.8352087Z Running tests... 2022-11-23T02:38:50.8352362Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8352720Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8352982Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8353185Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76260 2022-11-23T02:38:50.8353407Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76261 2022-11-23T02:38:50.8353628Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76262 2022-11-23T02:38:50.8353848Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76263 2022-11-23T02:38:50.8354227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8354408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8354841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8355256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8355626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8355805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8356188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8356384Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8356750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8356929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8357317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8357510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8357876Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8358036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8358413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8358608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8358869Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj2a4x33e 2022-11-23T02:38:50.8359141Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj2a4x33e/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8359372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8359635Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyp119ihq 2022-11-23T02:38:50.8359910Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyp119ihq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8360148Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk5gdoc2i 2022-11-23T02:38:50.8360417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk5gdoc2i/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8360645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8360960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8361214Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyet94np8 2022-11-23T02:38:50.8361484Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyet94np8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8361717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8361822Z ok (5.122s) 
2022-11-23T02:38:50.8361843Z 2022-11-23T02:38:50.8362116Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8362212Z Ran 1 test in 5.122s 2022-11-23T02:38:50.8362232Z 2022-11-23T02:38:50.8362327Z OK 2022-11-23T02:38:50.8362347Z 2022-11-23T02:38:50.8362471Z Generating XML reports... 2022-11-23T02:38:50.8362911Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023354.xml 2022-11-23T02:38:50.8363293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8363473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8363860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8364125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8364376Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxnf79wda 2022-11-23T02:38:50.8364648Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxnf79wda/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8364668Z 2022-11-23T02:38:50.8364779Z Running tests... 2022-11-23T02:38:50.8365045Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8365365Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8365655Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8365879Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76447 2022-11-23T02:38:50.8366100Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76448 2022-11-23T02:38:50.8366323Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76449 2022-11-23T02:38:50.8366525Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76450 2022-11-23T02:38:50.8366903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8367082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8367467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8367661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8368034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8368214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8368598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8368776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8369147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8369323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8369706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8369898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8370320Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8370496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8370875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8371072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8371318Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9qyprbj6 2022-11-23T02:38:50.8371577Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqj5zgqjt 2022-11-23T02:38:50.8371849Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9qyprbj6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8372115Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqj5zgqjt/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8372374Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd6x28iit 2022-11-23T02:38:50.8372648Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd6x28iit/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8372882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8373164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8373385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8373636Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4sul8h1e 2022-11-23T02:38:50.8373905Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4sul8h1e/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8374131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8374237Z ok (5.173s) 
2022-11-23T02:38:50.8374260Z 2022-11-23T02:38:50.8374532Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8374646Z Ran 1 test in 5.173s 2022-11-23T02:38:50.8374665Z 2022-11-23T02:38:50.8374762Z OK 2022-11-23T02:38:50.8374781Z 2022-11-23T02:38:50.8374907Z Generating XML reports... 2022-11-23T02:38:50.8375334Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023401.xml 2022-11-23T02:38:50.8375712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8375890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8376273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8376465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8376722Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmput48oayc 2022-11-23T02:38:50.8376998Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmput48oayc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8377018Z 2022-11-23T02:38:50.8377131Z Running tests... 2022-11-23T02:38:50.8377381Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8377705Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8377979Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8378204Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76634 2022-11-23T02:38:50.8378424Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76635 2022-11-23T02:38:50.8378642Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76636 2022-11-23T02:38:50.8378856Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76637 2022-11-23T02:38:50.8379296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8379475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8379844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8380045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8380416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8380593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8380973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8381167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8381537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8381713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8382072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8382312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8382693Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8382867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8383240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8383429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8383692Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppefu4b_y 2022-11-23T02:38:50.8383974Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppefu4b_y/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8384210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8384449Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp01mlg_cw 2022-11-23T02:38:50.8384718Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp01mlg_cw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8384952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8385211Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr5k4lum4 2022-11-23T02:38:50.8385480Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr5k4lum4/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8385709Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8385971Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9i95u4cz 2022-11-23T02:38:50.8386243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9i95u4cz/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8386455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8386561Z ok (4.169s) 
2022-11-23T02:38:50.8386585Z 2022-11-23T02:38:50.8386854Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8386968Z Ran 1 test in 4.169s 2022-11-23T02:38:50.8386988Z 2022-11-23T02:38:50.8387084Z OK 2022-11-23T02:38:50.8387103Z 2022-11-23T02:38:50.8387228Z Generating XML reports... 2022-11-23T02:38:50.8387668Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023409.xml 2022-11-23T02:38:50.8388045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8388336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8388710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8388903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8389164Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbluzrz0m 2022-11-23T02:38:50.8389439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbluzrz0m/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8389459Z 2022-11-23T02:38:50.8389573Z Running tests... 2022-11-23T02:38:50.8389841Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8390158Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8390413Z test_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8390622Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76817 2022-11-23T02:38:50.8390843Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76818 2022-11-23T02:38:50.8391061Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76819 2022-11-23T02:38:50.8391322Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76820 2022-11-23T02:38:50.8391708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8391888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8392277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8392470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8392840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8393003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8393384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8393578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8393953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8394132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8394507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8394700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8395274Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8395449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8395833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8396021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8396284Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp73y5l939 2022-11-23T02:38:50.8396557Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp73y5l939/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8396789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8397046Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkq06z5iu 2022-11-23T02:38:50.8397318Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkq06z5iu/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8397575Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzu8cqxxa 2022-11-23T02:38:50.8397915Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzu8cqxxa/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8398144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8398370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8398631Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl7oyn5lf 2022-11-23T02:38:50.8398901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl7oyn5lf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8399131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8399233Z ok (4.125s) 
2022-11-23T02:38:50.8399253Z 2022-11-23T02:38:50.8399525Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8399626Z Ran 1 test in 4.125s 2022-11-23T02:38:50.8399645Z 2022-11-23T02:38:50.8399742Z OK 2022-11-23T02:38:50.8399761Z 2022-11-23T02:38:50.8399889Z Generating XML reports... 2022-11-23T02:38:50.8400326Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023415.xml 2022-11-23T02:38:50.8400760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8400947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8401332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8401527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8401785Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpedwhli8h 2022-11-23T02:38:50.8402039Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpedwhli8h/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8402062Z 2022-11-23T02:38:50.8402176Z Running tests... 2022-11-23T02:38:50.8402447Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8402763Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8403036Z test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8403261Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77000 2022-11-23T02:38:50.8403482Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77001 2022-11-23T02:38:50.8403701Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77002 2022-11-23T02:38:50.8403902Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77003 2022-11-23T02:38:50.8404283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8404467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8404857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8405054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8405430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8405609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8405989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8406181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8406532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8406770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8407157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8407348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8407723Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8407900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8408279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8408472Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8408718Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdqotg2b2 2022-11-23T02:38:50.8408991Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdqotg2b2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8409234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8409493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd9ig42vm 2022-11-23T02:38:50.8409763Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd9ig42vm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8410067Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdhphkfrs 2022-11-23T02:38:50.8410349Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdhphkfrs/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8410605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzbhtgqxd 2022-11-23T02:38:50.8410876Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzbhtgqxd/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8411089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8411326Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8411557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8411803Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8412056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:38:50.8412303Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:38:50.8412547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:38:50.8412960Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8413363Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8413746Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8414145Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:38:50.8414899Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8415021Z warnings.warn( 2022-11-23T02:38:50.8415760Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8415929Z warnings.warn( 2022-11-23T02:38:50.8416672Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8416784Z warnings.warn( 2022-11-23T02:38:50.8417589Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:38:50.8417705Z warnings.warn( 2022-11-23T02:38:50.8417791Z ok (4.092s) 2022-11-23T02:38:50.8417829Z 2022-11-23T02:38:50.8418082Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8418197Z Ran 1 test in 4.092s 2022-11-23T02:38:50.8418220Z 2022-11-23T02:38:50.8418316Z OK 2022-11-23T02:38:50.8418335Z 2022-11-23T02:38:50.8418462Z Generating XML reports... 
2022-11-23T02:38:50.8418899Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023422.xml 2022-11-23T02:38:50.8419278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8419513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8419909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8420088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8420348Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn3q9cac1 2022-11-23T02:38:50.8420623Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn3q9cac1/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8420647Z 2022-11-23T02:38:50.8420760Z Running tests... 2022-11-23T02:38:50.8421030Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8421345Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8421616Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8421843Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77183 2022-11-23T02:38:50.8422050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77184 2022-11-23T02:38:50.8422270Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77185 2022-11-23T02:38:50.8422484Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77186 2022-11-23T02:38:50.8422867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8423051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8423440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8423635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8424012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8424191Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8424554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8424751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8425120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8425298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8425734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8425927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8426303Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8426484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8426840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8427032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8427294Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprawj8jdo 2022-11-23T02:38:50.8427570Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprawj8jdo/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8427807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8428065Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgdt4yhyt 2022-11-23T02:38:50.8428340Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgdt4yhyt/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8428660Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptlyaqtfp 2022-11-23T02:38:50.8428942Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptlyaqtfp/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8429181Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkd73gkmn 2022-11-23T02:38:50.8429450Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkd73gkmn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8429683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8429913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8430146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8430252Z ok (4.158s) 
2022-11-23T02:38:50.8430272Z 2022-11-23T02:38:50.8430544Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8430661Z Ran 1 test in 4.158s 2022-11-23T02:38:50.8430685Z 2022-11-23T02:38:50.8430763Z OK 2022-11-23T02:38:50.8430802Z 2022-11-23T02:38:50.8430910Z Generating XML reports... 2022-11-23T02:38:50.8431348Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023428.xml 2022-11-23T02:38:50.8431723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8431903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8432289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8432490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8432752Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvackqt2j 2022-11-23T02:38:50.8433027Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvackqt2j/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8433051Z 2022-11-23T02:38:50.8433146Z Running tests... 2022-11-23T02:38:50.8433414Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8433732Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8434004Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8434225Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77366 2022-11-23T02:38:50.8434446Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77367 2022-11-23T02:38:50.8434724Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77368 2022-11-23T02:38:50.8434944Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77369 2022-11-23T02:38:50.8435528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8435713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8436099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8436291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8436657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8436838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8437222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8437412Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8437787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8438019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8438406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8438597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8438968Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8439147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8439522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8439718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8439981Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmvc4ntdw 2022-11-23T02:38:50.8440244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmvc4ntdw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8440505Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpah23nwez 2022-11-23T02:38:50.8440760Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeks8ffkq 2022-11-23T02:38:50.8441032Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpah23nwez/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8441298Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeks8ffkq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8441532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8441769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8441994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8442248Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4g7e4z30 2022-11-23T02:38:50.8442498Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4g7e4z30/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8442727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8442833Z ok (4.171s) 
2022-11-23T02:38:50.8442853Z 2022-11-23T02:38:50.8443125Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8443242Z Ran 1 test in 4.171s 2022-11-23T02:38:50.8443261Z 2022-11-23T02:38:50.8443356Z OK 2022-11-23T02:38:50.8443375Z 2022-11-23T02:38:50.8443501Z Generating XML reports... 2022-11-23T02:38:50.8444013Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023435.xml 2022-11-23T02:38:50.8444370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8444547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8444938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8445135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8445389Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprl94ursy 2022-11-23T02:38:50.8445658Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprl94ursy/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8445678Z 2022-11-23T02:38:50.8445788Z Running tests... 2022-11-23T02:38:50.8446055Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8446375Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8446637Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8446862Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77549 2022-11-23T02:38:50.8447129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77550 2022-11-23T02:38:50.8447356Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77551 2022-11-23T02:38:50.8447575Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77552 2022-11-23T02:38:50.8447953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8448132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8448514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8448695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8449074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8449249Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8449631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8449825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8450194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8450370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8450755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8450954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8451305Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8451481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8451862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8452057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8452316Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3lc423hy 2022-11-23T02:38:50.8452632Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3lc423hy/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8452894Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy78wci61 2022-11-23T02:38:50.8453223Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy78wci61/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8453439Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8453669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8453929Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8kkcmjhf 2022-11-23T02:38:50.8454203Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8kkcmjhf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8454432Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8454687Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_s1syd33 2022-11-23T02:38:50.8454955Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_s1syd33/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8455186Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8455295Z ok (5.262s) 
2022-11-23T02:38:50.8455316Z 2022-11-23T02:38:50.8455573Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8455687Z Ran 1 test in 5.262s 2022-11-23T02:38:50.8455706Z 2022-11-23T02:38:50.8455800Z OK 2022-11-23T02:38:50.8455819Z 2022-11-23T02:38:50.8455992Z Generating XML reports... 2022-11-23T02:38:50.8456443Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023441.xml 2022-11-23T02:38:50.8456818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8456997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8457382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8457558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8457824Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwg0fekhm 2022-11-23T02:38:50.8458095Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwg0fekhm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8458115Z 2022-11-23T02:38:50.8458228Z Running tests... 2022-11-23T02:38:50.8458498Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8458818Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8459088Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8459309Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77736 2022-11-23T02:38:50.8459531Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77737 2022-11-23T02:38:50.8459733Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77738 2022-11-23T02:38:50.8459956Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77739 2022-11-23T02:38:50.8460335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8460513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8460905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8461102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8461472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8461652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8462014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8462265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8462631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8462808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8463198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8463390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8463753Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8463930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8464310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8464487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8464754Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppu41s0rv 2022-11-23T02:38:50.8465030Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppu41s0rv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8465336Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjnk5nklb 2022-11-23T02:38:50.8465612Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjnk5nklb/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8465868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjwxc4rcs 2022-11-23T02:38:50.8466140Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjwxc4rcs/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8466372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8466585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8466819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8467077Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpos1iygit 2022-11-23T02:38:50.8467350Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpos1iygit/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8467580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8467687Z ok (4.562s) 
2022-11-23T02:38:50.8467708Z 2022-11-23T02:38:50.8467980Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8468096Z Ran 1 test in 4.562s 2022-11-23T02:38:50.8468115Z 2022-11-23T02:38:50.8468209Z OK 2022-11-23T02:38:50.8468229Z 2022-11-23T02:38:50.8468338Z Generating XML reports... 2022-11-23T02:38:50.8468777Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023449.xml 2022-11-23T02:38:50.8469156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8469335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8469721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8469921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8470179Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpft_9a61_ 2022-11-23T02:38:50.8470448Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpft_9a61_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8470468Z 2022-11-23T02:38:50.8470562Z Running tests... 2022-11-23T02:38:50.8470833Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8471151Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8471465Z test_allreduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8471688Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77943 2022-11-23T02:38:50.8471908Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77944 2022-11-23T02:38:50.8472130Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77945 2022-11-23T02:38:50.8472347Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77946 2022-11-23T02:38:50.8472731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8472893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8473280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8473474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8473850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8474032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8474457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8474657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8475316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8475491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8475874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8476066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8476445Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8476619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8476996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8477193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8477459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcnkxoe2v 2022-11-23T02:38:50.8477738Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcnkxoe2v/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8477979Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpojmbyyhj 2022-11-23T02:38:50.8478252Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpojmbyyhj/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8478494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8478726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8478982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcueo4uz6 2022-11-23T02:38:50.8479260Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcueo4uz6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8479492Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8479748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsqkd3bwk 2022-11-23T02:38:50.8480000Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsqkd3bwk/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8480227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8480332Z ok (4.356s) 
2022-11-23T02:38:50.8480431Z 2022-11-23T02:38:50.8480712Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8480826Z Ran 1 test in 4.356s 2022-11-23T02:38:50.8480846Z 2022-11-23T02:38:50.8480941Z OK 2022-11-23T02:38:50.8480960Z 2022-11-23T02:38:50.8481086Z Generating XML reports... 2022-11-23T02:38:50.8481528Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023456.xml 2022-11-23T02:38:50.8481905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8482068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8482453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8482647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8482904Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgrvdqq27 2022-11-23T02:38:50.8483180Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgrvdqq27/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8483200Z 2022-11-23T02:38:50.8483312Z Running tests... 2022-11-23T02:38:50.8483579Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8483955Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8484207Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8484427Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78150 2022-11-23T02:38:50.8484648Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78151 2022-11-23T02:38:50.8484866Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78152 2022-11-23T02:38:50.8485080Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78153 2022-11-23T02:38:50.8485464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8485644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8486019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8486198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8486565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8486762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8487149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8487343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8487721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8487900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8488285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8488484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8488835Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8489016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8489394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8489587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8489848Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsv4fz0hh 2022-11-23T02:38:50.8490183Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsv4fz0hh/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8490417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8490676Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiwhp6k52 2022-11-23T02:38:50.8490952Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiwhp6k52/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8491190Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7k0zxr1p 2022-11-23T02:38:50.8491460Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7k0zxr1p/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8491717Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_qcq4sfj 2022-11-23T02:38:50.8491988Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_qcq4sfj/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8492223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8492454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8492683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8492790Z ok (5.676s) 
2022-11-23T02:38:50.8492867Z 2022-11-23T02:38:50.8493133Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8493247Z Ran 1 test in 5.676s 2022-11-23T02:38:50.8493266Z 2022-11-23T02:38:50.8493361Z OK 2022-11-23T02:38:50.8493381Z 2022-11-23T02:38:50.8493508Z Generating XML reports... 2022-11-23T02:38:50.8493945Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023502.xml 2022-11-23T02:38:50.8494322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8494508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8494896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8495096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8495338Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5f81v_ul 2022-11-23T02:38:50.8495609Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5f81v_ul/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8495629Z 2022-11-23T02:38:50.8495741Z Running tests... 2022-11-23T02:38:50.8496009Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8496325Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8496582Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8496811Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78361 2022-11-23T02:38:50.8497033Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78362 2022-11-23T02:38:50.8497233Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78363 2022-11-23T02:38:50.8497450Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78364 2022-11-23T02:38:50.8497833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8498014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8498397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8498591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8498963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8499201Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8499568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8499758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8500138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8500316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8500695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8500885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8501251Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8501435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8501812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8501983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8502288Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo8sb4cs3 2022-11-23T02:38:50.8502556Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp10kaj8t5 2022-11-23T02:38:50.8502827Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo8sb4cs3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8503096Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp10kaj8t5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8503332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8503564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8503825Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp08f86ygw 2022-11-23T02:38:50.8504098Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp08f86ygw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8504317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8504576Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphqw9xu6e 2022-11-23T02:38:50.8504846Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphqw9xu6e/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8505076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8505182Z ok (4.149s) 
2022-11-23T02:38:50.8505202Z 2022-11-23T02:38:50.8505477Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8505596Z Ran 1 test in 4.149s 2022-11-23T02:38:50.8505615Z 2022-11-23T02:38:50.8505711Z OK 2022-11-23T02:38:50.8505730Z 2022-11-23T02:38:50.8505838Z Generating XML reports... 2022-11-23T02:38:50.8506277Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023510.xml 2022-11-23T02:38:50.8506659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8506840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8507226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8507422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8507681Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc4duqdgc 2022-11-23T02:38:50.8507954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc4duqdgc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8508030Z 2022-11-23T02:38:50.8508148Z Running tests... 2022-11-23T02:38:50.8508399Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8508710Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8508968Z test_broadcast_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8509193Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78544 2022-11-23T02:38:50.8509415Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78545 2022-11-23T02:38:50.8509632Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78546 2022-11-23T02:38:50.8509850Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78547 2022-11-23T02:38:50.8510232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8510399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8510787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8510983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8511402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8511587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8511973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8512165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8512533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8512697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8513085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8513279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8513649Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8513827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8514209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8514404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8514666Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx1fnwhzh 2022-11-23T02:38:50.8514945Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx1fnwhzh/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8515359Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_ghj8y3w 2022-11-23T02:38:50.8515635Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_ghj8y3w/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8515890Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps91c3do2 2022-11-23T02:38:50.8516165Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps91c3do2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8516399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8516630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8516887Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvpw9__ce 2022-11-23T02:38:50.8517155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvpw9__ce/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8517470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8517680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8517785Z ok (4.119s) 
2022-11-23T02:38:50.8517805Z 2022-11-23T02:38:50.8518078Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8518199Z Ran 1 test in 4.119s 2022-11-23T02:38:50.8518219Z 2022-11-23T02:38:50.8518316Z OK 2022-11-23T02:38:50.8518335Z 2022-11-23T02:38:50.8518462Z Generating XML reports... 2022-11-23T02:38:50.8518905Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023517.xml 2022-11-23T02:38:50.8519287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8519448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8519836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8520031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8520290Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplzytrehi 2022-11-23T02:38:50.8520628Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplzytrehi/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8520651Z 2022-11-23T02:38:50.8520769Z Running tests... 2022-11-23T02:38:50.8521033Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8521347Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8521593Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8521814Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78727 2022-11-23T02:38:50.8522038Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78728 2022-11-23T02:38:50.8522257Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78729 2022-11-23T02:38:50.8522471Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78730 2022-11-23T02:38:50.8522855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8523040Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8523425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8523618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8523971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8524149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8524535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8524730Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8525088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8525268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8525645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8525839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8526198Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8526377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8526811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8527003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8527263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph5ermxj2 2022-11-23T02:38:50.8527540Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph5ermxj2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8527773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8528030Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmcyykecm 2022-11-23T02:38:50.8528302Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmcyykecm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8528541Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq7_3an4e 2022-11-23T02:38:50.8528809Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq7_3an4e/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8529043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8529271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8529530Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0l47x3e8 2022-11-23T02:38:50.8529899Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0l47x3e8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8530139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8530245Z ok (5.188s) 
2022-11-23T02:38:50.8530266Z 2022-11-23T02:38:50.8530520Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8530636Z Ran 1 test in 5.189s 2022-11-23T02:38:50.8530655Z 2022-11-23T02:38:50.8530751Z OK 2022-11-23T02:38:50.8530770Z 2022-11-23T02:38:50.8530901Z Generating XML reports... 2022-11-23T02:38:50.8531341Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023523.xml 2022-11-23T02:38:50.8531716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8531896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8532286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8532485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8532727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx3nbey9a 2022-11-23T02:38:50.8533000Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx3nbey9a/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8533020Z 2022-11-23T02:38:50.8533131Z Running tests... 2022-11-23T02:38:50.8533404Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8533724Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8533979Z test_broadcast_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8534204Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78914 2022-11-23T02:38:50.8534429Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78915 2022-11-23T02:38:50.8534631Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78916 2022-11-23T02:38:50.8534852Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78917 2022-11-23T02:38:50.8535231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8535412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8535798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8536050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8536429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8536611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8536991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8537166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8537536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8537715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8538096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8538292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8538664Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8538842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8539272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8539453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8539711Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpujt235h5 2022-11-23T02:38:50.8539981Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpujt235h5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8540213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8540474Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpal2iqjae 2022-11-23T02:38:50.8540747Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpal2iqjae/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8540980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8541244Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3yci1uzd 2022-11-23T02:38:50.8541517Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3yci1uzd/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8541729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8541982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpba81p1tp 2022-11-23T02:38:50.8542253Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpba81p1tp/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8542487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8542594Z ok (4.182s) 
2022-11-23T02:38:50.8542614Z 2022-11-23T02:38:50.8542886Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8543001Z Ran 1 test in 4.182s 2022-11-23T02:38:50.8543021Z 2022-11-23T02:38:50.8543117Z OK 2022-11-23T02:38:50.8543136Z 2022-11-23T02:38:50.8543246Z Generating XML reports... 2022-11-23T02:38:50.8543686Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023531.xml 2022-11-23T02:38:50.8544060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8544239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8544625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8544875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8545135Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcbg0g_as 2022-11-23T02:38:50.8545408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcbg0g_as/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8545428Z 2022-11-23T02:38:50.8545540Z Running tests... 2022-11-23T02:38:50.8545797Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8546116Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8546372Z test_broadcast_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8546593Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79097 2022-11-23T02:38:50.8546814Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79098 2022-11-23T02:38:50.8547032Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79099 2022-11-23T02:38:50.8547254Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79100 2022-11-23T02:38:50.8547633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8547794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8548225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8548424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8548797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8548977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8549357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8549554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8549925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8550105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8550479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8550673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8551040Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8551219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8551600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8551801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8552063Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwo_68nfc 2022-11-23T02:38:50.8552336Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwo_68nfc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8552582Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjpb87sp0 2022-11-23T02:38:50.8552898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjpb87sp0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8553150Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6toan0zt 2022-11-23T02:38:50.8553416Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6toan0zt/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8553646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8553875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8554192Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3lduuxci 2022-11-23T02:38:50.8554462Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3lduuxci/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8554690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8554902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8555007Z ok (4.232s) 
2022-11-23T02:38:50.8555236Z 2022-11-23T02:38:50.8555522Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8555636Z Ran 1 test in 4.233s 2022-11-23T02:38:50.8555656Z 2022-11-23T02:38:50.8555755Z OK 2022-11-23T02:38:50.8555774Z 2022-11-23T02:38:50.8555901Z Generating XML reports... 2022-11-23T02:38:50.8556339Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023537.xml 2022-11-23T02:38:50.8556720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8556882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8557268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8557550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8557816Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4m7kj3s8 2022-11-23T02:38:50.8558086Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4m7kj3s8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8558106Z 2022-11-23T02:38:50.8558217Z Running tests... 2022-11-23T02:38:50.8558487Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8558808Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8559079Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8559282Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79304 2022-11-23T02:38:50.8559501Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79305 2022-11-23T02:38:50.8559726Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79306 2022-11-23T02:38:50.8559944Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79307 2022-11-23T02:38:50.8560321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8560501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8560887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8561089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8561443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8561619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8562001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8562196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8562561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8562737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8563116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8563308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8563756Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8563914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8564290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8564484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8564745Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1w85m_k7 2022-11-23T02:38:50.8565019Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1w85m_k7/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8565279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppzxsr5m6 2022-11-23T02:38:50.8565553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppzxsr5m6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8565793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8566014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8566269Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfr5d663y 2022-11-23T02:38:50.8566584Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfr5d663y/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8566845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqjqgnx4i 2022-11-23T02:38:50.8567115Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqjqgnx4i/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8567343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8567575Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8567680Z ok (5.522s) 
2022-11-23T02:38:50.8567704Z 2022-11-23T02:38:50.8567972Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8568071Z Ran 1 test in 5.522s 2022-11-23T02:38:50.8568090Z 2022-11-23T02:38:50.8568185Z OK 2022-11-23T02:38:50.8568204Z 2022-11-23T02:38:50.8568329Z Generating XML reports... 2022-11-23T02:38:50.8568771Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023544.xml 2022-11-23T02:38:50.8569150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8569333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8569717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8569914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8570154Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd25v6ub2 2022-11-23T02:38:50.8570430Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd25v6ub2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8570449Z 2022-11-23T02:38:50.8570561Z Running tests... 2022-11-23T02:38:50.8570831Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8571155Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8571400Z test_empty_tensors (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8571620Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79515 2022-11-23T02:38:50.8571844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79516 2022-11-23T02:38:50.8572044Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79517 2022-11-23T02:38:50.8572265Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79518 2022-11-23T02:38:50.8572706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8572887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8573272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8573470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8573843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8574023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8574402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8574576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8574949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8575127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8575505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8575739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8576109Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8576287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8576670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8576859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8577099Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkgvajz8v 2022-11-23T02:38:50.8577379Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkgvajz8v/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8577635Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprtn5tvh1 2022-11-23T02:38:50.8577905Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprtn5tvh1/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8578165Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbm8cuhb8 2022-11-23T02:38:50.8578439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbm8cuhb8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8578672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8578903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8579140Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp53ata6d8 2022-11-23T02:38:50.8579412Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp53ata6d8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8579645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8579872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8579979Z ok (4.061s) 
2022-11-23T02:38:50.8580003Z 2022-11-23T02:38:50.8580276Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8580395Z Ran 1 test in 4.061s 2022-11-23T02:38:50.8580414Z 2022-11-23T02:38:50.8580511Z OK 2022-11-23T02:38:50.8580530Z 2022-11-23T02:38:50.8580637Z Generating XML reports... 2022-11-23T02:38:50.8581079Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023552.xml 2022-11-23T02:38:50.8581453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8581689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8582073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8582269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8582531Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpisry3fng 2022-11-23T02:38:50.8582806Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpisry3fng/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8582827Z 2022-11-23T02:38:50.8582940Z Running tests... 2022-11-23T02:38:50.8583193Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8583511Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8583755Z test_gather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8583980Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79698 2022-11-23T02:38:50.8584202Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79699 2022-11-23T02:38:50.8584421Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79700 2022-11-23T02:38:50.8584684Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79701 2022-11-23T02:38:50.8585076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8585240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8585628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8585819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8586191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8586375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8586757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8586950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8587321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8587501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8587866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8588059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8588425Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8588605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8588982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8589179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8589443Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz337foo2 2022-11-23T02:38:50.8589713Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz337foo2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8589927Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8590186Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1w51vv2_ 2022-11-23T02:38:50.8590452Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1w51vv2_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8590708Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdegm9r1m 2022-11-23T02:38:50.8591034Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdegm9r1m/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8591287Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_ps46xkw 2022-11-23T02:38:50.8591561Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_ps46xkw/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8591796Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8592021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8592232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8592336Z ok (4.126s) 
2022-11-23T02:38:50.8592357Z 2022-11-23T02:38:50.8592631Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8592751Z Ran 1 test in 4.126s 2022-11-23T02:38:50.8592770Z 2022-11-23T02:38:50.8592865Z OK 2022-11-23T02:38:50.8592884Z 2022-11-23T02:38:50.8593011Z Generating XML reports... 2022-11-23T02:38:50.8593450Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023558.xml 2022-11-23T02:38:50.8593874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8594047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8594435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8594630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8594889Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoqop_t9d 2022-11-23T02:38:50.8595381Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoqop_t9d/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8595408Z 2022-11-23T02:38:50.8595521Z Running tests... 2022-11-23T02:38:50.8595787Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8596105Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8596366Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8596573Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79881 2022-11-23T02:38:50.8596792Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79882 2022-11-23T02:38:50.8597010Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79883 2022-11-23T02:38:50.8597223Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79884 2022-11-23T02:38:50.8597602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8597787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8598175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8598371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8598731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8598911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8599291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8599483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8599851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8600113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8600494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8600685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8601065Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8601224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8601600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8601789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8602048Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq_6lv15_ 2022-11-23T02:38:50.8602322Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq_6lv15_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8602584Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1cavaz68 2022-11-23T02:38:50.8602859Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1cavaz68/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8603093Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8603390Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdnb7biek 2022-11-23T02:38:50.8603672Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdnb7biek/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8603902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8604130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8604387Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpovbwox6o 2022-11-23T02:38:50.8604664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpovbwox6o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8604893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8604999Z ok (5.174s) 
2022-11-23T02:38:50.8605020Z 2022-11-23T02:38:50.8605293Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8605396Z Ran 1 test in 5.175s 2022-11-23T02:38:50.8605416Z 2022-11-23T02:38:50.8605513Z OK 2022-11-23T02:38:50.8605533Z 2022-11-23T02:38:50.8605661Z Generating XML reports... 2022-11-23T02:38:50.8606097Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023605.xml 2022-11-23T02:38:50.8606471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8606648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8607040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8607236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8607477Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6eioyxji 2022-11-23T02:38:50.8607752Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6eioyxji/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8607771Z 2022-11-23T02:38:50.8607885Z Running tests... 2022-11-23T02:38:50.8608156Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8608474Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8608720Z test_gather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8608940Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80068 2022-11-23T02:38:50.8609217Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80069 2022-11-23T02:38:50.8609437Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80070 2022-11-23T02:38:50.8609636Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80071 2022-11-23T02:38:50.8610019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8610200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8610586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8610784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8611157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8611339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8611722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8611897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8612271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8612491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8612878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8613068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8613435Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8613615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8613996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8614191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8614432Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0qz4ykev 2022-11-23T02:38:50.8614696Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsl3s3m23 2022-11-23T02:38:50.8614971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0qz4ykev/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8615239Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsl3s3m23/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8615499Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphcwhf1dl 2022-11-23T02:38:50.8615770Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphcwhf1dl/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8616004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8616243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8616455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8616710Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpotzwigkp 2022-11-23T02:38:50.8616984Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpotzwigkp/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8617212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8617317Z ok (4.163s) 
2022-11-23T02:38:50.8617337Z 2022-11-23T02:38:50.8617612Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8617729Z Ran 1 test in 4.163s 2022-11-23T02:38:50.8617748Z 2022-11-23T02:38:50.8617843Z OK 2022-11-23T02:38:50.8617862Z 2022-11-23T02:38:50.8617989Z Generating XML reports... 2022-11-23T02:38:50.8618469Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023612.xml 2022-11-23T02:38:50.8618844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8619026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8619414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8619610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8619867Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu01r62tn 2022-11-23T02:38:50.8620135Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu01r62tn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8620155Z 2022-11-23T02:38:50.8620266Z Running tests... 2022-11-23T02:38:50.8620514Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8620839Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8621107Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8621331Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80251 2022-11-23T02:38:50.8621607Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80252 2022-11-23T02:38:50.8621834Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80253 2022-11-23T02:38:50.8622052Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80254 2022-11-23T02:38:50.8622431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8622609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8622976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8623177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8623551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8623734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8624115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8624306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8624672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8624849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8625218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8625415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8625782Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8625958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8626341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8626539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8626801Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprcy3gvvl 2022-11-23T02:38:50.8627078Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprcy3gvvl/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8627336Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb1le9965 2022-11-23T02:38:50.8627643Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb1le9965/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8627899Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3zbss8q6 2022-11-23T02:38:50.8628169Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3zbss8q6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8628430Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkkfukm3a 2022-11-23T02:38:50.8628702Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkkfukm3a/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8628934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8629164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8629387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8629603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8629708Z ok (4.165s) 
2022-11-23T02:38:50.8629727Z 2022-11-23T02:38:50.8630003Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8630120Z Ran 1 test in 4.165s 2022-11-23T02:38:50.8630139Z 2022-11-23T02:38:50.8630235Z OK 2022-11-23T02:38:50.8630254Z 2022-11-23T02:38:50.8630424Z Generating XML reports... 2022-11-23T02:38:50.8630869Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023619.xml 2022-11-23T02:38:50.8631245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8631424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8631790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8631986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8632242Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpacht7gm3 2022-11-23T02:38:50.8632514Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpacht7gm3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8632534Z 2022-11-23T02:38:50.8632646Z Running tests... 2022-11-23T02:38:50.8632921Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8633240Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8633486Z test_gather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8633687Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80434 2022-11-23T02:38:50.8633907Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80435 2022-11-23T02:38:50.8634125Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80436 2022-11-23T02:38:50.8634345Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80437 2022-11-23T02:38:50.8634721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8634901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8635498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8635693Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8636064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8636221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8636597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8636886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8637252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8637428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8637807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8637999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8638375Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8638533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8638908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8639106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8639366Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6wbpb_qg 2022-11-23T02:38:50.8639641Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6wbpb_qg/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8639931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8640198Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvkfr409u 2022-11-23T02:38:50.8640470Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvkfr409u/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8640724Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiypsylyz 2022-11-23T02:38:50.8640979Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiypsylyz/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8641239Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo4ky2i8o 2022-11-23T02:38:50.8641514Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo4ky2i8o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8641745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8641974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8642207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8642314Z ok (4.778s) 
2022-11-23T02:38:50.8642335Z 2022-11-23T02:38:50.8642607Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8642704Z Ran 1 test in 4.778s 2022-11-23T02:38:50.8642723Z 2022-11-23T02:38:50.8642818Z OK 2022-11-23T02:38:50.8642837Z 2022-11-23T02:38:50.8642964Z Generating XML reports... 2022-11-23T02:38:50.8643402Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023625.xml 2022-11-23T02:38:50.8643783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8643964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8644347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8644545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8644803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9foa44ny 2022-11-23T02:38:50.8645055Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9foa44ny/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8645076Z 2022-11-23T02:38:50.8645185Z Running tests... 2022-11-23T02:38:50.8645457Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8645774Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8646088Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8646312Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80641 2022-11-23T02:38:50.8646532Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80642 2022-11-23T02:38:50.8646754Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80643 2022-11-23T02:38:50.8646956Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80644 2022-11-23T02:38:50.8647337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8647517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8647903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8648104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8648479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8648657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8649081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8649282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8649638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8649813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8650193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8650385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8650759Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8650943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8651330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8651524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8651768Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp45kflj3k 2022-11-23T02:38:50.8652029Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplpv9se0q 2022-11-23T02:38:50.8652302Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp45kflj3k/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8652569Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplpv9se0q/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8652870Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzn3fsii5 2022-11-23T02:38:50.8653142Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzn3fsii5/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8653373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8653611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8653843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8654082Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpds2up8qk 2022-11-23T02:38:50.8654349Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpds2up8qk/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8654575Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8654681Z ok (7.359s) 
2022-11-23T02:38:50.8654753Z 2022-11-23T02:38:50.8655034Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8655149Z Ran 1 test in 7.359s 2022-11-23T02:38:50.8655169Z 2022-11-23T02:38:50.8655265Z OK 2022-11-23T02:38:50.8655285Z 2022-11-23T02:38:50.8655413Z Generating XML reports... 2022-11-23T02:38:50.8655836Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023633.xml 2022-11-23T02:38:50.8656212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8656393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8656777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8656970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8657228Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbzwvqknn 2022-11-23T02:38:50.8657502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbzwvqknn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8657522Z 2022-11-23T02:38:50.8657634Z Running tests... 2022-11-23T02:38:50.8657903Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8658249Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8658528Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8658750Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80852 2022-11-23T02:38:50.8658969Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80853 2022-11-23T02:38:50.8659190Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80854 2022-11-23T02:38:50.8659408Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80855 2022-11-23T02:38:50.8659794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8659975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8660344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8660542Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8660913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8661093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8661472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8661664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8662036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8662211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8662576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8662771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8663140Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8663315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8663694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8663889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8664148Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxxi03ggc 2022-11-23T02:38:50.8664474Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxxi03ggc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8664732Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplcmiy_jc 2022-11-23T02:38:50.8664991Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplcmiy_jc/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8665224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8665481Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb3cp70so 2022-11-23T02:38:50.8665750Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb3cp70so/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8665980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8666211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8666471Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpssu3fnc_ 2022-11-23T02:38:50.8666746Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpssu3fnc_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8666976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8667063Z ok (4.162s) 
2022-11-23T02:38:50.8667127Z 2022-11-23T02:38:50.8667409Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8667525Z Ran 1 test in 4.163s 2022-11-23T02:38:50.8667545Z 2022-11-23T02:38:50.8667640Z OK 2022-11-23T02:38:50.8667660Z 2022-11-23T02:38:50.8667786Z Generating XML reports... 2022-11-23T02:38:50.8668226Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023642.xml 2022-11-23T02:38:50.8668607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8668793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8669166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8669359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8669619Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl42o7df2 2022-11-23T02:38:50.8669890Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl42o7df2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8669911Z 2022-11-23T02:38:50.8670023Z Running tests... 2022-11-23T02:38:50.8670293Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8670607Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8670858Z test_reduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8671065Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81039 2022-11-23T02:38:50.8671287Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81040 2022-11-23T02:38:50.8671508Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81041 2022-11-23T02:38:50.8671729Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81042 2022-11-23T02:38:50.8672107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8672286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8672672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8672866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8673238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8673511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8673891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8674085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8674455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8674635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8675005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8675412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8675802Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8675983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8676364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8676557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8676892Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsmd0t01s 2022-11-23T02:38:50.8677176Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsmd0t01s/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8677408Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8677666Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc4hxiabf 2022-11-23T02:38:50.8677943Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc4hxiabf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8678204Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0557enys 2022-11-23T02:38:50.8678455Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0557enys/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8678715Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyywbs_v0 2022-11-23T02:38:50.8678990Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyywbs_v0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8679224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8679455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8679681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8679786Z ok (4.148s) 
2022-11-23T02:38:50.8679807Z 2022-11-23T02:38:50.8680081Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8680181Z Ran 1 test in 4.148s 2022-11-23T02:38:50.8680219Z 2022-11-23T02:38:50.8680296Z OK 2022-11-23T02:38:50.8680315Z 2022-11-23T02:38:50.8680442Z Generating XML reports... 2022-11-23T02:38:50.8680885Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023649.xml 2022-11-23T02:38:50.8681262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8681443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8681827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8682023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8682280Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmposlf8c66 2022-11-23T02:38:50.8682537Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmposlf8c66/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8682629Z 2022-11-23T02:38:50.8682751Z Running tests... 2022-11-23T02:38:50.8683020Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8683335Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8683596Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8683822Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81222 2022-11-23T02:38:50.8684042Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81223 2022-11-23T02:38:50.8684261Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81224 2022-11-23T02:38:50.8684461Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81225 2022-11-23T02:38:50.8684843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8685026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8685412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8685607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8686047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8686231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8686612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8686805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8687156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8687341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8687718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8687910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8688285Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8688462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8688848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8689038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8689278Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf_jhe5kf 2022-11-23T02:38:50.8689553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf_jhe5kf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8689788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8690047Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppgna_4t1 2022-11-23T02:38:50.8690321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppgna_4t1/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8690555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8690814Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp11kf5up6 2022-11-23T02:38:50.8691085Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp11kf5up6/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8691315Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8691553Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8tb65wht 2022-11-23T02:38:50.8691878Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8tb65wht/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8692105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8692209Z ok (5.223s) 
2022-11-23T02:38:50.8692230Z 2022-11-23T02:38:50.8692504Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8692626Z Ran 1 test in 5.223s 2022-11-23T02:38:50.8692646Z 2022-11-23T02:38:50.8692741Z OK 2022-11-23T02:38:50.8692760Z 2022-11-23T02:38:50.8692886Z Generating XML reports... 2022-11-23T02:38:50.8693306Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023655.xml 2022-11-23T02:38:50.8693684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8693864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8694251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8694444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8694700Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_3jr0qt8 2022-11-23T02:38:50.8695015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_3jr0qt8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8695037Z 2022-11-23T02:38:50.8695155Z Running tests... 2022-11-23T02:38:50.8695422Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8695721Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8695969Z test_reduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8696189Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81409 2022-11-23T02:38:50.8696413Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81410 2022-11-23T02:38:50.8696632Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81411 2022-11-23T02:38:50.8696853Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81412 2022-11-23T02:38:50.8697236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8697419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8697787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8697982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8698354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8698533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8698915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8699107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8699471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8699651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8700034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8700206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8700571Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8700747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8701185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8701378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8701642Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyg1ocr5p 2022-11-23T02:38:50.8701922Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyg1ocr5p/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8702183Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplkohyc5c 2022-11-23T02:38:50.8702438Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplkohyc5c/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8702670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8702896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8703153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_f1d6t57 2022-11-23T02:38:50.8703428Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_f1d6t57/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8703656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8703912Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqnscb3s3 2022-11-23T02:38:50.8704236Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqnscb3s3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8704474Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8704561Z ok (4.156s) 
2022-11-23T02:38:50.8704580Z 2022-11-23T02:38:50.8704855Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8704970Z Ran 1 test in 4.156s 2022-11-23T02:38:50.8704989Z 2022-11-23T02:38:50.8705083Z OK 2022-11-23T02:38:50.8705102Z 2022-11-23T02:38:50.8705237Z Generating XML reports... 2022-11-23T02:38:50.8705676Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023703.xml 2022-11-23T02:38:50.8706056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8706239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8706613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8706808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8707064Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppb863eld 2022-11-23T02:38:50.8707335Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppb863eld/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8707356Z 2022-11-23T02:38:50.8707466Z Running tests... 2022-11-23T02:38:50.8707741Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8708060Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8708307Z test_reduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8708529Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81592 2022-11-23T02:38:50.8708735Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81593 2022-11-23T02:38:50.8708959Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81594 2022-11-23T02:38:50.8709173Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81595 2022-11-23T02:38:50.8709550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8709730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8710115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8710367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8710737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8710899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8711281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8711474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8711844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8712022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8712397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8712595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8712970Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8713148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8713550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8713749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8714014Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm9ofs9kq 2022-11-23T02:38:50.8714287Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm9ofs9kq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8714546Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnjta_eql 2022-11-23T02:38:50.8714823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnjta_eql/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8715295Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpspgz5b08 2022-11-23T02:38:50.8715573Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpspgz5b08/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8715816Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb77naean 2022-11-23T02:38:50.8716086Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb77naean/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8716318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8716549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8716781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8717015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8717120Z ok (4.557s) 
2022-11-23T02:38:50.8717141Z 2022-11-23T02:38:50.8717457Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8717573Z Ran 1 test in 4.557s 2022-11-23T02:38:50.8717593Z 2022-11-23T02:38:50.8717669Z OK 2022-11-23T02:38:50.8717688Z 2022-11-23T02:38:50.8717818Z Generating XML reports... 2022-11-23T02:38:50.8718259Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023709.xml 2022-11-23T02:38:50.8718637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8718816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8719203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8719484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8719744Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfoe93e9h 2022-11-23T02:38:50.8719999Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfoe93e9h/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8720039Z 2022-11-23T02:38:50.8720132Z Running tests... 2022-11-23T02:38:50.8720405Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8720724Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8720980Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8721207Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81799 2022-11-23T02:38:50.8721430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81800 2022-11-23T02:38:50.8721649Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81801 2022-11-23T02:38:50.8721854Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81802 2022-11-23T02:38:50.8722235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8722413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8722861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8723062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8723440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8723621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8723990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8724175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8724543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8724740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8725128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8725326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8725702Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8725883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8726266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8726463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8726719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi1bb0d3f 2022-11-23T02:38:50.8726976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi1bb0d3f/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8727236Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdfxo2vmf 2022-11-23T02:38:50.8727511Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdfxo2vmf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8727771Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpptpkhz6r 2022-11-23T02:38:50.8728044Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpptpkhz6r/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8728300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpytuswv16 2022-11-23T02:38:50.8728571Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpytuswv16/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8728859Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8729067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8729298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8729529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8729639Z ok (6.168s) 
2022-11-23T02:38:50.8729659Z 2022-11-23T02:38:50.8729931Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8730049Z Ran 1 test in 6.168s 2022-11-23T02:38:50.8730068Z 2022-11-23T02:38:50.8730169Z OK 2022-11-23T02:38:50.8730187Z 2022-11-23T02:38:50.8730317Z Generating XML reports... 2022-11-23T02:38:50.8730737Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023716.xml 2022-11-23T02:38:50.8731123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8731303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8731690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8731929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8732194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1akeqtvf 2022-11-23T02:38:50.8732470Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1akeqtvf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8732490Z 2022-11-23T02:38:50.8732607Z Running tests... 2022-11-23T02:38:50.8732878Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8733177Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8733425Z test_round_robin (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8733647Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82010 2022-11-23T02:38:50.8733871Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82011 2022-11-23T02:38:50.8734097Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82012 2022-11-23T02:38:50.8734315Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82013 2022-11-23T02:38:50.8734693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8734876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8735247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8735440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8735816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8735994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8736378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8736571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8736938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8737114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8737497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8737670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8738096Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8738273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8738653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8738849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8739113Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpalw859cq 2022-11-23T02:38:50.8739387Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpalw859cq/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8739619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8739859Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpijgqgnce 2022-11-23T02:38:50.8740133Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpijgqgnce/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8740384Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_2rgn4my 2022-11-23T02:38:50.8740648Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_2rgn4my/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8740948Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptqne1gax 2022-11-23T02:38:50.8741228Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptqne1gax/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8741461Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8741686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8741914Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8742477Z [W 
ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8743022Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8743574Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8744115Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8744225Z ok (4.214s) 2022-11-23T02:38:50.8744245Z 2022-11-23T02:38:50.8744521Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8744640Z Ran 1 test in 4.214s 2022-11-23T02:38:50.8744660Z 2022-11-23T02:38:50.8744757Z OK 2022-11-23T02:38:50.8744776Z 2022-11-23T02:38:50.8744900Z Generating XML reports... 
2022-11-23T02:38:50.8745341Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023725.xml 2022-11-23T02:38:50.8745721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8745904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8746333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8746529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8746792Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpht6l1gel 2022-11-23T02:38:50.8747070Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpht6l1gel/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8747090Z 2022-11-23T02:38:50.8747204Z Running tests... 2022-11-23T02:38:50.8747475Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8747793Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8748061Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8748285Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82205 2022-11-23T02:38:50.8748492Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82206 2022-11-23T02:38:50.8748712Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82207 2022-11-23T02:38:50.8748931Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82208 2022-11-23T02:38:50.8749365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8749551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8749943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8750141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8750510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8750671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8751054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8751248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8751618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8751797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8752184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8752377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8752784Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8752942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8753326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8753520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8753782Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmlao9udb 2022-11-23T02:38:50.8754065Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmlao9udb/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8754323Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqvrjnvsn 2022-11-23T02:38:50.8754595Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqvrjnvsn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8754853Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr1s5mn2_ 2022-11-23T02:38:50.8755343Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr1s5mn2_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8755566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8755885Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8756143Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzzq10lh9 2022-11-23T02:38:50.8756418Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzzq10lh9/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8756653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8756880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8757439Z [W 
ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8757995Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8758597Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8759146Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8759691Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8760237Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. 
(function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8760778Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8761315Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:38:50.8761428Z ok (4.356s) 2022-11-23T02:38:50.8761450Z 2022-11-23T02:38:50.8761733Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8761835Z Ran 1 test in 4.356s 2022-11-23T02:38:50.8761855Z 2022-11-23T02:38:50.8761952Z OK 2022-11-23T02:38:50.8761970Z 2022-11-23T02:38:50.8762097Z Generating XML reports... 
2022-11-23T02:38:50.8762536Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023731.xml 2022-11-23T02:38:50.8762915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8763096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8763545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8763742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8763984Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbgsyfcys 2022-11-23T02:38:50.8764261Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbgsyfcys/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8764282Z 2022-11-23T02:38:50.8764395Z Running tests... 2022-11-23T02:38:50.8764664Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8764982Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8765238Z test_scatter_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8765463Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82424 2022-11-23T02:38:50.8765690Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82425 2022-11-23T02:38:50.8765895Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82426 2022-11-23T02:38:50.8766115Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82427 2022-11-23T02:38:50.8766546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8766734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8767124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8767320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8767689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8767868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8768252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8768426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8768790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8768970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8769355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8769546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8769912Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8770088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8770471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8770645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8770909Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkl4zapjv 2022-11-23T02:38:50.8771188Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkl4zapjv/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8771425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8771686Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsuhozc_l 2022-11-23T02:38:50.8771968Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsuhozc_l/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8772198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8772461Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw5048ch_ 2022-11-23T02:38:50.8772788Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw5048ch_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8773000Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8773260Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp024a1l5i 2022-11-23T02:38:50.8773526Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp024a1l5i/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8773758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8773864Z ok (4.144s) 
2022-11-23T02:38:50.8773884Z 2022-11-23T02:38:50.8774159Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8774274Z Ran 1 test in 4.144s 2022-11-23T02:38:50.8774293Z 2022-11-23T02:38:50.8774391Z OK 2022-11-23T02:38:50.8774410Z 2022-11-23T02:38:50.8774523Z Generating XML reports... 2022-11-23T02:38:50.8774960Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023738.xml 2022-11-23T02:38:50.8775336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8775517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8775951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8776154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8776411Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkvibmscl 2022-11-23T02:38:50.8776681Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkvibmscl/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8776701Z 2022-11-23T02:38:50.8776811Z Running tests... 2022-11-23T02:38:50.8777065Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8777378Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8777636Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8777863Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82607 2022-11-23T02:38:50.8778086Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82608 2022-11-23T02:38:50.8778308Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82609 2022-11-23T02:38:50.8778522Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82610 2022-11-23T02:38:50.8778901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8779062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8779456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8779650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8780023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8780204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8780587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8780783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8781150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8781324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8781682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8781931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8782307Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8782483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8782868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8783059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8783319Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf9vf68ig 2022-11-23T02:38:50.8783592Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf9vf68ig/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8783835Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu4t10gzp 2022-11-23T02:38:50.8784112Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu4t10gzp/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8784368Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp00mzfl8o 2022-11-23T02:38:50.8784601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8784912Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp00mzfl8o/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8785145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8785376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8785630Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_x0f3_re 2022-11-23T02:38:50.8785902Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_x0f3_re/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8786114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8786220Z ok (5.158s) 
2022-11-23T02:38:50.8786239Z 2022-11-23T02:38:50.8786511Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8786627Z Ran 1 test in 5.158s 2022-11-23T02:38:50.8786646Z 2022-11-23T02:38:50.8786742Z OK 2022-11-23T02:38:50.8786761Z 2022-11-23T02:38:50.8786891Z Generating XML reports... 2022-11-23T02:38:50.8787332Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023744.xml 2022-11-23T02:38:50.8787713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8787873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8788264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8788464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8788721Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp54zg84m7 2022-11-23T02:38:50.8788992Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp54zg84m7/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8789011Z 2022-11-23T02:38:50.8789122Z Running tests... 2022-11-23T02:38:50.8789396Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8789717Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8789963Z test_scatter_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8790187Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82794 2022-11-23T02:38:50.8790409Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82795 2022-11-23T02:38:50.8790627Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82796 2022-11-23T02:38:50.8790899Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82797 2022-11-23T02:38:50.8791264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8791446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8791838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8792035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8792408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8792586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8792968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8793166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8793536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8793695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8794121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8794319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8794690Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8794865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8795452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8795651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8795913Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkbfq0ro8 2022-11-23T02:38:50.8796170Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkbfq0ro8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8796406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8796662Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdxxz0771 2022-11-23T02:38:50.8796927Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdxxz0771/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8797156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8797411Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpixzxb5_k 2022-11-23T02:38:50.8797682Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpixzxb5_k/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8797941Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7f1gn3gf 2022-11-23T02:38:50.8798209Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7f1gn3gf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8798417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8798652Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8798758Z ok (4.213s) 
2022-11-23T02:38:50.8798778Z 2022-11-23T02:38:50.8799053Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8799168Z Ran 1 test in 4.213s 2022-11-23T02:38:50.8799187Z 2022-11-23T02:38:50.8799283Z OK 2022-11-23T02:38:50.8799302Z 2022-11-23T02:38:50.8799428Z Generating XML reports... 2022-11-23T02:38:50.8799867Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023752.xml 2022-11-23T02:38:50.8800314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8800496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8800887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8801082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8801338Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp92b6t5qg 2022-11-23T02:38:50.8801604Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp92b6t5qg/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8801624Z 2022-11-23T02:38:50.8801735Z Running tests... 2022-11-23T02:38:50.8802006Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8802324Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8802559Z test_scatter_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8802777Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82977 2022-11-23T02:38:50.8802996Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82978 2022-11-23T02:38:50.8803274Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82979 2022-11-23T02:38:50.8803501Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82980 2022-11-23T02:38:50.8803878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8804056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8804440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8804621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8804989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8805168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8805555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8805745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8806109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8806286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8806667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8806857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8807208Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8807385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8807763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8807960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8808220Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3olnukrp 2022-11-23T02:38:50.8808494Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3olnukrp/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8808747Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4ckmsxa1 2022-11-23T02:38:50.8809018Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4ckmsxa1/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8809289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8809541Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzup90o8v 2022-11-23T02:38:50.8809808Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzup90o8v/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8810041Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8810269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8810516Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_4lqgztn 2022-11-23T02:38:50.8810787Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_4lqgztn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8811015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8811119Z ok (4.960s) 
2022-11-23T02:38:50.8811144Z 2022-11-23T02:38:50.8811401Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8811515Z Ran 1 test in 4.960s 2022-11-23T02:38:50.8811534Z 2022-11-23T02:38:50.8811629Z OK 2022-11-23T02:38:50.8811649Z 2022-11-23T02:38:50.8811776Z Generating XML reports... 2022-11-23T02:38:50.8812319Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023759.xml 2022-11-23T02:38:50.8812708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8812887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8813274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8813449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8813704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyyqvuoci 2022-11-23T02:38:50.8813983Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyyqvuoci/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8814003Z 2022-11-23T02:38:50.8814114Z Running tests... 2022-11-23T02:38:50.8814383Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8814709Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8815030Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) ... 
skip: Test is flaky, see https://github.com/pytorch/pytorch/issues/15963 (0.001s) 2022-11-23T02:38:50.8815049Z 2022-11-23T02:38:50.8815316Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8815431Z Ran 1 test in 0.001s 2022-11-23T02:38:50.8815450Z 2022-11-23T02:38:50.8815540Z OK (skipped=1) 2022-11-23T02:38:50.8815560Z 2022-11-23T02:38:50.8815686Z Generating XML reports... 2022-11-23T02:38:50.8816123Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023806.xml 2022-11-23T02:38:50.8816500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8816679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8817070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8817264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8817522Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5lgxgmh8 2022-11-23T02:38:50.8817798Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5lgxgmh8/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8817818Z 2022-11-23T02:38:50.8817912Z Running tests... 2022-11-23T02:38:50.8818181Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8818553Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8818805Z test_send_recv_all_to_all (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8819025Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83217 2022-11-23T02:38:50.8819249Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83218 2022-11-23T02:38:50.8819466Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83219 2022-11-23T02:38:50.8819682Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83220 2022-11-23T02:38:50.8820042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8820221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8820603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8820800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8821173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8821348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8821773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8821971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8822342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8822500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8822873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8823069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8823440Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8823617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8823997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8824190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8824446Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9nwv0v6q 2022-11-23T02:38:50.8824699Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9nwv0v6q/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8824931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8825194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpttplyqr2 2022-11-23T02:38:50.8825471Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpttplyqr2/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8825723Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1unomkfm 2022-11-23T02:38:50.8825995Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1unomkfm/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8826225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8826447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8826703Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8qgx1q6s 2022-11-23T02:38:50.8826955Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8qgx1q6s/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8827184Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8827345Z ok (4.187s) 
2022-11-23T02:38:50.8827366Z 2022-11-23T02:38:50.8827638Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8827753Z Ran 1 test in 4.187s 2022-11-23T02:38:50.8827772Z 2022-11-23T02:38:50.8827866Z OK 2022-11-23T02:38:50.8827886Z 2022-11-23T02:38:50.8828012Z Generating XML reports... 2022-11-23T02:38:50.8828462Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023808.xml 2022-11-23T02:38:50.8828825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8829004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8829385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8829580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8829841Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5adcd_sy 2022-11-23T02:38:50.8830114Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5adcd_sy/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8830133Z 2022-11-23T02:38:50.8830244Z Running tests... 2022-11-23T02:38:50.8830559Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8830884Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8831146Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) ... skip: intermittent failures on Windows, in CI (0.000s) 2022-11-23T02:38:50.8831166Z 2022-11-23T02:38:50.8831427Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8831542Z Ran 1 test in 0.001s 2022-11-23T02:38:50.8831561Z 2022-11-23T02:38:50.8831670Z OK (skipped=1) 2022-11-23T02:38:50.8831689Z 2022-11-23T02:38:50.8831821Z Generating XML reports... 
2022-11-23T02:38:50.8832257Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023815.xml 2022-11-23T02:38:50.8832634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8832817Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8833186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8833380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8833637Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1vgl2msf 2022-11-23T02:38:50.8833912Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1vgl2msf/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8833932Z 2022-11-23T02:38:50.8834043Z Running tests... 2022-11-23T02:38:50.8834313Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8834628Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8834904Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8835335Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83433 2022-11-23T02:38:50.8835547Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83434 2022-11-23T02:38:50.8835761Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83435 2022-11-23T02:38:50.8835975Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83436 2022-11-23T02:38:50.8836356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8836537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8837012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8837209Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8837577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8837740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8838116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8838293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8838678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8838873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8839253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8839450Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8839821Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8840057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8840429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8840623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8840884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzyrtq83f 2022-11-23T02:38:50.8841161Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzyrtq83f/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8841417Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8vb1q_cx 2022-11-23T02:38:50.8841690Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8vb1q_cx/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8841943Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8x_tbrb0 2022-11-23T02:38:50.8842212Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8x_tbrb0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8842430Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8842688Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphi805pwa 2022-11-23T02:38:50.8842956Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphi805pwa/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8843188Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8843419Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8843649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8843754Z ok (5.227s) 
2022-11-23T02:38:50.8843774Z 2022-11-23T02:38:50.8844046Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8844163Z Ran 1 test in 5.227s 2022-11-23T02:38:50.8844182Z 2022-11-23T02:38:50.8844260Z OK 2022-11-23T02:38:50.8844283Z 2022-11-23T02:38:50.8844412Z Generating XML reports... 2022-11-23T02:38:50.8844855Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023817.xml 2022-11-23T02:38:50.8845235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8845419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8845806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8846068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8846330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn5990qba 2022-11-23T02:38:50.8846582Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn5990qba/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8846619Z 2022-11-23T02:38:50.8846717Z Running tests... 2022-11-23T02:38:50.8846988Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8847304Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8847571Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8847788Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83800 2022-11-23T02:38:50.8848007Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83801 2022-11-23T02:38:50.8848229Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83802 2022-11-23T02:38:50.8848446Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83803 2022-11-23T02:38:50.8848809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8849037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8849435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8849628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8850001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8850178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8850560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8850762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8851112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8851287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8851666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8851858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8852229Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8852404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8852819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8853015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8853276Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsdangq7p 2022-11-23T02:38:50.8853535Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsdangq7p/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8853798Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpurx5ib05 2022-11-23T02:38:50.8854072Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpurx5ib05/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8854306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:38:50.8854567Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpetk3qh2_ 2022-11-23T02:38:50.8854836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpetk3qh2_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8855130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:38:50.8855359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:38:50.8855597Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5_21v5br 2022-11-23T02:38:50.8855869Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5_21v5br/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8856098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:38:50.8856203Z ok (4.110s) 
2022-11-23T02:38:50.8856222Z 2022-11-23T02:38:50.8856494Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8856608Z Ran 1 test in 4.110s 2022-11-23T02:38:50.8856627Z 2022-11-23T02:38:50.8856722Z OK 2022-11-23T02:38:50.8856741Z 2022-11-23T02:38:50.8856869Z Generating XML reports... 2022-11-23T02:38:50.8857309Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023824.xml 2022-11-23T02:38:50.8857671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8857848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8858279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8858481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8858740Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1xeqdgk_ 2022-11-23T02:38:50.8859012Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1xeqdgk_/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8859032Z 2022-11-23T02:38:50.8859143Z Running tests... 2022-11-23T02:38:50.8859413Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8859717Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8859900Z test_forward_backward (__main__.ReducerTest) ... ok (0.008s) 2022-11-23T02:38:50.8859919Z 2022-11-23T02:38:50.8860183Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8860298Z Ran 1 test in 0.012s 2022-11-23T02:38:50.8860319Z 2022-11-23T02:38:50.8860412Z OK 2022-11-23T02:38:50.8860436Z 2022-11-23T02:38:50.8860562Z Generating XML reports... 
2022-11-23T02:38:50.8860965Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023831.xml 2022-11-23T02:38:50.8861341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8861518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8861883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8862083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8862340Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmhjnpcq0 2022-11-23T02:38:50.8862614Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmhjnpcq0/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8862634Z 2022-11-23T02:38:50.8862748Z Running tests... 2022-11-23T02:38:50.8863016Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8863335Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8864191Z test_forward_backward_optimizer (__main__.ReducerTest) ... [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:38:50.8864357Z ok (0.012s) 2022-11-23T02:38:50.8864377Z 2022-11-23T02:38:50.8864644Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8864744Z Ran 1 test in 0.022s 2022-11-23T02:38:50.8864782Z 2022-11-23T02:38:50.8864860Z OK 2022-11-23T02:38:50.8864879Z 2022-11-23T02:38:50.8865006Z Generating XML reports... 2022-11-23T02:38:50.8865404Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023833.xml 2022-11-23T02:38:50.8865779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8865958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8866351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8866546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8866807Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpliemowkk 2022-11-23T02:38:50.8867122Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpliemowkk/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8867144Z 2022-11-23T02:38:50.8867263Z Running tests... 2022-11-23T02:38:50.8867528Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8867848Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8868057Z test_forward_backward_unused_parameters (__main__.ReducerTest) ... ok (0.008s) 2022-11-23T02:38:50.8868076Z 2022-11-23T02:38:50.8868342Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8868462Z Ran 1 test in 0.012s 2022-11-23T02:38:50.8868481Z 2022-11-23T02:38:50.8868576Z OK 2022-11-23T02:38:50.8868596Z 2022-11-23T02:38:50.8868703Z Generating XML reports... 
2022-11-23T02:38:50.8869103Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023835.xml 2022-11-23T02:38:50.8869483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8869664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8870049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8870243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8870502Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbi8r9hcn 2022-11-23T02:38:50.8870778Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbi8r9hcn/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8870801Z 2022-11-23T02:38:50.8870914Z Running tests... 2022-11-23T02:38:50.8871166Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8871486Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8871672Z test_multi_dtype_multi_bucket (__main__.ReducerTest) ... ok (0.004s) 2022-11-23T02:38:50.8871694Z 2022-11-23T02:38:50.8871957Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8872071Z Ran 1 test in 0.012s 2022-11-23T02:38:50.8872090Z 2022-11-23T02:38:50.8872184Z OK 2022-11-23T02:38:50.8872204Z 2022-11-23T02:38:50.8872330Z Generating XML reports... 
2022-11-23T02:38:50.8872731Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023837.xml 2022-11-23T02:38:50.8873105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8873324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8873709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8873903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8874168Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph1lobfzy 2022-11-23T02:38:50.8874444Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph1lobfzy/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8874464Z 2022-11-23T02:38:50.8874575Z Running tests... 2022-11-23T02:38:50.8874843Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8875441Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8875624Z test_multi_dtype_single_bucket (__main__.ReducerTest) ... ok (0.006s) 2022-11-23T02:38:50.8875649Z 2022-11-23T02:38:50.8875919Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8876032Z Ran 1 test in 0.012s 2022-11-23T02:38:50.8876052Z 2022-11-23T02:38:50.8876147Z OK 2022-11-23T02:38:50.8876166Z 2022-11-23T02:38:50.8876291Z Generating XML reports... 
2022-11-23T02:38:50.8876778Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023839.xml 2022-11-23T02:38:50.8877169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8877351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8877735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8877912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8878172Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpulyjmx01 2022-11-23T02:38:50.8878448Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpulyjmx01/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8878469Z 2022-11-23T02:38:50.8878585Z Running tests... 2022-11-23T02:38:50.8878853Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8879173Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8879368Z test_single_dtype_single_bucket (__main__.ReducerTest) ... ok (0.003s) 2022-11-23T02:38:50.8879387Z 2022-11-23T02:38:50.8879648Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8879744Z Ran 1 test in 0.011s 2022-11-23T02:38:50.8879764Z 2022-11-23T02:38:50.8879859Z OK 2022-11-23T02:38:50.8879878Z 2022-11-23T02:38:50.8880004Z Generating XML reports... 
2022-11-23T02:38:50.8880397Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023842.xml 2022-11-23T02:38:50.8880778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8880957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8881342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8881541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8881803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp5llqt3g 2022-11-23T02:38:50.8882062Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp5llqt3g/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8882083Z 2022-11-23T02:38:50.8882195Z Running tests... 2022-11-23T02:38:50.8882464Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8882779Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8883087Z test_logging_init (__main__.RendezvousEnvTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8883340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:38:50.8883753Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:38:50.8883863Z ok (1.661s) 2022-11-23T02:38:50.8883883Z 2022-11-23T02:38:50.8884131Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8884245Z Ran 1 test in 1.661s 2022-11-23T02:38:50.8884265Z 2022-11-23T02:38:50.8884359Z OK 2022-11-23T02:38:50.8884379Z 2022-11-23T02:38:50.8884505Z Generating XML reports... 
2022-11-23T02:38:50.8884925Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20221123023844.xml 2022-11-23T02:38:50.8885300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:38:50.8885488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:38:50.8885875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:38:50.8886070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:38:50.8886358Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7dhmpp3 2022-11-23T02:38:50.8886644Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7dhmpp3/_remote_module_non_scriptable.py 2022-11-23T02:38:50.8886663Z 2022-11-23T02:38:50.8886776Z Running tests... 2022-11-23T02:38:50.8887044Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8887360Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:38:50.8887601Z test_default_store_timeout_gloo (__main__.TimeoutTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:38:50.8888358Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/74714 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.660s) 2022-11-23T02:38:50.8888379Z 2022-11-23T02:38:50.8888653Z ---------------------------------------------------------------------- 2022-11-23T02:38:50.8888772Z Ran 1 test in 1.661s 2022-11-23T02:38:50.8888791Z 2022-11-23T02:38:50.8888882Z OK (skipped=1) 2022-11-23T02:38:50.8888919Z 2022-11-23T02:38:50.8889026Z Generating XML reports... 
2022-11-23T02:38:50.8889426Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20221123023848.xml 2022-11-23T02:38:50.8889446Z 2022-11-23T02:38:50.8889877Z ##[endgroup] 2022-11-23T02:38:50.8890315Z FINISHED PRINTING LOG FILE of distributed/test_c10d_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_gloo_bc12qwi7) 2022-11-23T02:38:50.8890340Z 2022-11-23T02:38:50.8890607Z Running distributed/fsdp/test_fsdp_core ... [2022-11-23 02:38:50.730216] 2022-11-23T02:38:50.8891114Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:38:50.730551] 2022-11-23T02:48:20.0329042Z 2022-11-23T02:48:20.0329643Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_core 2022-11-23T02:48:20.0333688Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_core (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_core_xt0conde) 2022-11-23T02:48:20.0361623Z 2022-11-23T02:48:20.0362046Z Running tests... 2022-11-23T02:48:20.0362614Z ---------------------------------------------------------------------- 2022-11-23T02:48:20.0363192Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_core 2022-11-23T02:48:20.0363993Z test_pre_backward_hook_registration_after_state_dict (__main__.TestHooks) 2022-11-23T02:48:20.0366559Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:20.0367134Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84323 2022-11-23T02:48:20.0367776Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84324 2022-11-23T02:48:20.0369599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0370268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0371159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0371660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0372511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0373017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0374464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0375458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0376678Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0377800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0378652Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0379389Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0379923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0380403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0381706Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0382622Z warnings.warn( 2022-11-23T02:48:20.0384461Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0385239Z warnings.warn( 2022-11-23T02:48:20.0385477Z dist init r=0, world=2 2022-11-23T02:48:20.0385733Z dist init r=1, world=2 2022-11-23T02:48:20.0385978Z ok (5.631s) 2022-11-23T02:48:20.0386298Z test_pre_backward_hook_registration_cuda_first_False (__main__.TestHooks) 2022-11-23T02:48:20.0387476Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84406 2022-11-23T02:48:20.0388348Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84407 2022-11-23T02:48:20.0389519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0390487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0391782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0392695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0393779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0394641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0396193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0397032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0397894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0398854Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0400091Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0401362Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0402362Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0403450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0405923Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0407405Z warnings.warn( 2022-11-23T02:48:20.0409662Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0411140Z warnings.warn( 2022-11-23T02:48:20.0411557Z dist init r=1, world=2 2022-11-23T02:48:20.0413057Z dist init r=0, world=2 2022-11-23T02:48:20.0413494Z ok (4.011s) 2022-11-23T02:48:20.0414104Z test_pre_backward_hook_registration_cuda_first_True (__main__.TestHooks) 2022-11-23T02:48:20.0415326Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84489 2022-11-23T02:48:20.0416328Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84490 2022-11-23T02:48:20.0417544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0418339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0419412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0420317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0421392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0422223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0423282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0424171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0425206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0426094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0427384Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0428717Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0429689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0430569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0431243Z dist init r=1, world=2 2022-11-23T02:48:20.0431735Z dist init r=0, world=2 2022-11-23T02:48:20.0432226Z ok (3.910s) 2022-11-23T02:48:20.0432837Z test_register_functions_called_cuda_first_False_mixed_precision_False (__main__.TestHooks) 2022-11-23T02:48:20.0433937Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84572 2022-11-23T02:48:20.0434982Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84573 2022-11-23T02:48:20.0436758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0437596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0438682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0439568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0440650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0441524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0442602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0443447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0444252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0445178Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 1 2022-11-23T02:48:20.0446431Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0447730Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0448681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0449605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0452023Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0453527Z warnings.warn( 2022-11-23T02:48:20.0455837Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0457479Z warnings.warn( 2022-11-23T02:48:20.0457907Z dist init r=0, world=2 2022-11-23T02:48:20.0458398Z dist init r=1, world=2 2022-11-23T02:48:20.0458877Z ok (3.911s) 2022-11-23T02:48:20.0459484Z test_register_functions_called_cuda_first_False_mixed_precision_True (__main__.TestHooks) 2022-11-23T02:48:20.0460570Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84651 2022-11-23T02:48:20.0461596Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84652 2022-11-23T02:48:20.0462761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0463596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0464677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0465560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0466636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0467498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0468680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0469550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0470368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0471251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0472470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0473798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0474743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0476031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0478187Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0479524Z warnings.warn( 2022-11-23T02:48:20.0481797Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0483239Z warnings.warn( 2022-11-23T02:48:20.0485231Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0486561Z warnings.warn( 2022-11-23T02:48:20.0488806Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0490410Z warnings.warn( 2022-11-23T02:48:20.0490834Z dist init r=0, world=2 2022-11-23T02:48:20.0491317Z dist init r=1, world=2 2022-11-23T02:48:20.0491751Z ok (3.911s) 2022-11-23T02:48:20.0492404Z test_register_functions_called_cuda_first_True_mixed_precision_False (__main__.TestHooks) 2022-11-23T02:48:20.0493421Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84730 2022-11-23T02:48:20.0494384Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84731 2022-11-23T02:48:20.0495549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0496381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0497438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0498342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0499573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0500514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0501581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0502449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0503281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0504212Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0505434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0506772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0507682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0508569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0509221Z dist init r=1, world=2 2022-11-23T02:48:20.0509689Z dist init r=0, world=2 2022-11-23T02:48:20.0510115Z ok (3.911s) 2022-11-23T02:48:20.0510734Z test_register_functions_called_cuda_first_True_mixed_precision_True (__main__.TestHooks) 2022-11-23T02:48:20.0511739Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84809 2022-11-23T02:48:20.0512765Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84810 2022-11-23T02:48:20.0513910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0514723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0516217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0517069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0518159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0518992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0520141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0521202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0522137Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0523204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0524464Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0525796Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0526866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0527811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0529914Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0531229Z warnings.warn( 2022-11-23T02:48:20.0533273Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0534678Z warnings.warn( 2022-11-23T02:48:20.0535129Z dist init r=1, world=2 2022-11-23T02:48:20.0535584Z dist init r=0, world=2 2022-11-23T02:48:20.0536031Z ok (3.911s) 2022-11-23T02:48:20.0536609Z test_transformer_no_grad_mixed_precision_False (__main__.TestNoGrad) 2022-11-23T02:48:20.0537812Z Tests that for an FSDP-wrapped transformer model with shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84888 2022-11-23T02:48:20.0538861Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84889 2022-11-23T02:48:20.0539983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0540828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0541901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0542798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0543907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0544747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0545851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0546754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0547627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0548553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0549933Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0551084Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0551770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0552768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0555888Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0557400Z warnings.warn( 2022-11-23T02:48:20.0559710Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0561143Z warnings.warn( 2022-11-23T02:48:20.0561586Z dist init r=1, world=2 2022-11-23T02:48:20.0562012Z dist init r=0, world=2 2022-11-23T02:48:20.0562426Z ok (4.011s) 2022-11-23T02:48:20.0563056Z test_transformer_no_grad_mixed_precision_True (__main__.TestNoGrad) 2022-11-23T02:48:20.0564385Z Tests that for an FSDP-wrapped transformer model with shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84971 2022-11-23T02:48:20.0565395Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84972 2022-11-23T02:48:20.0566548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0567359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0568430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0569366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0570495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0571302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0572408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0573356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0574165Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0575075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0576273Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0577541Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0578539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0579422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0581556Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0582950Z warnings.warn( 2022-11-23T02:48:20.0584882Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0586355Z warnings.warn( 2022-11-23T02:48:20.0588634Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0590113Z warnings.warn( 2022-11-23T02:48:20.0592371Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0593876Z warnings.warn( 2022-11-23T02:48:20.0594347Z dist init r=1, world=2 2022-11-23T02:48:20.0594781Z dist init r=0, world=2 2022-11-23T02:48:20.0595829Z ok (4.011s) 2022-11-23T02:48:20.0596445Z test_param_change_after_init_mixed_precision_False (__main__.TestParamInit) 2022-11-23T02:48:20.0597768Z Tests that changing FSDP model parameter values in-place after FSDP ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85054 2022-11-23T02:48:20.0598758Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85055 2022-11-23T02:48:20.0599891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0600739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0601833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0602745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0603921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0604723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0605788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0606673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0607541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0608536Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0609756Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0611149Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0612131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0613069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0615539Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0617176Z warnings.warn( 2022-11-23T02:48:20.0619463Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.0620874Z warnings.warn( 2022-11-23T02:48:20.0621329Z dist init r=1, world=2 2022-11-23T02:48:20.0621785Z dist init r=0, world=2 2022-11-23T02:48:20.0622245Z ok (4.011s) 2022-11-23T02:48:20.0622855Z test_param_change_after_init_mixed_precision_True (__main__.TestParamInit) 2022-11-23T02:48:20.0624125Z Tests that changing FSDP model parameter values in-place after FSDP ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85133 2022-11-23T02:48:20.0625202Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85134 2022-11-23T02:48:20.0626301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0627154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0628350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0629270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0630384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0631208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0632286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0633157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0634016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0634908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0636660Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0637983Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0638946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0639825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0641995Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0643322Z warnings.warn( 2022-11-23T02:48:20.0645264Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:48:20.0646577Z warnings.warn( 2022-11-23T02:48:20.0648830Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.0650453Z warnings.warn( 2022-11-23T02:48:20.0652738Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0654196Z warnings.warn( 2022-11-23T02:48:20.0654662Z dist init r=1, world=2 2022-11-23T02:48:20.0655074Z dist init r=0, world=2 2022-11-23T02:48:20.0655542Z ok (3.911s) 2022-11-23T02:48:20.0656135Z test_delayed_optim_step_offload_false_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:48:20.0657122Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85212 2022-11-23T02:48:20.0658157Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85213 2022-11-23T02:48:20.0659423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0660284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0661311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0662172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0663278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0664140Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0665186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:48:20.0666058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0666901Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0667846Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0669140Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0670463Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0671404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0672331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0673277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0674216Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0677099Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0678539Z warnings.warn( 2022-11-23T02:48:20.0680752Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0682404Z warnings.warn( 2022-11-23T02:48:20.0683110Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0684005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0684882Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0685747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0686669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0687603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0688443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0689378Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0690282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0691252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0692128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0693068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0693993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0694869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0695757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0696674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0697616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0698501Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0701019Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.0702657Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.0705125Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.0706886Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.0707750Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0708612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0709540Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0710453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0711465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0712357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0713231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0714187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0715416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0716348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0717266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0718170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0718998Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0719866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0720753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0721596Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0722704Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0723623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0724584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0725417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0726324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0727229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0728064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0728931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0730831Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.0733200Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.0734605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0735472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0736389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0737273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0737906Z dist init r=1, world=2 2022-11-23T02:48:20.0738360Z dist init r=0, world=2 2022-11-23T02:48:20.0738813Z ok (17.230s) 2022-11-23T02:48:20.0739423Z test_delayed_optim_step_offload_false_none (__main__.TestParityWithDDP) 2022-11-23T02:48:20.0740396Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85295 2022-11-23T02:48:20.0741451Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85296 2022-11-23T02:48:20.0742641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0743664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0744759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0745616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0746711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0747576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0748606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0749551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0750391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0751320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0752576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0753971Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0754973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0756223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0757142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0758101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0760598Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0762095Z warnings.warn( 2022-11-23T02:48:20.0764296Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0765816Z warnings.warn( 2022-11-23T02:48:20.0766530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0767485Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0768439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0769339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0770258Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0771160Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0772021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0772940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0773897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0774967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0775828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0776747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0777657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0778616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0779549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0780386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0781262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0782167Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0784118Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.0787249Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.0788950Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.0789794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0790693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0791595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0792543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0793459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0794325Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0795678Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0796603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0797502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0798379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0799272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0800125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0800947Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0801865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0802723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0803651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0804510Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0805421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0806465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0807348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0808227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0809164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0810063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0810860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0812806Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.0814292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0815264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0816293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0816960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0817353Z dist init r=1, world=2 2022-11-23T02:48:20.0817617Z dist init r=0, world=2 2022-11-23T02:48:20.0817846Z ok (28.747s) 2022-11-23T02:48:20.0818205Z test_delayed_optim_step_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:48:20.0818764Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85378 2022-11-23T02:48:20.0819286Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85379 2022-11-23T02:48:20.0819934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0820399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0820992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0821462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0822058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0822512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0823083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0823560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:48:20.0824024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0824535Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0825189Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0825895Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0826427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0826909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0827384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0827878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0829171Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0830040Z warnings.warn( 2022-11-23T02:48:20.0831213Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0831970Z warnings.warn( 2022-11-23T02:48:20.0832355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0832845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0833335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0833854Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0834344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0834823Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0835495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0835967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0836446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0836928Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0837405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0837866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0838337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0838812Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0839268Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0839736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0840205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0840679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0841693Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.0843233Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.0844120Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.0844577Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0845163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0845630Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0846113Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0846599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0847057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0847538Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0848019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0848499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0848969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0849450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0849925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0850449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0850947Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0851422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0851896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0852355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0852830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0853309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0853785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0854243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0854716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0855187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.0855641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0856653Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.0857394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0857877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0858341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0858824Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0859188Z dist init r=1, world=2 2022-11-23T02:48:20.0859441Z dist init r=0, world=2 2022-11-23T02:48:20.0859664Z ok (28.748s) 2022-11-23T02:48:20.0860012Z test_delayed_optim_step_offload_true_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:48:20.0861168Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82490 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:48:20.0862020Z test_delayed_optim_step_offload_true_none (__main__.TestParityWithDDP) 2022-11-23T02:48:20.0862543Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85461 2022-11-23T02:48:20.0863084Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85462 2022-11-23T02:48:20.0863712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0864157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0864743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0865221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0865818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.0866255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.0866840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.0867362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.0867819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.0868323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.0868994Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.0869697Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.0870217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.0870697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.0871173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0871666Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0872931Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.0873764Z warnings.warn( 2022-11-23T02:48:20.0874946Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.0875965Z warnings.warn( 2022-11-23T02:48:20.0876243Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.0876609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.0876987Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.0877364Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.0877725Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.0878118Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.0878566Z self.run() 2022-11-23T02:48:20.0878907Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.0879258Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.0879795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.0880197Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.0880716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.0881119Z getattr(self, test_name)() 2022-11-23T02:48:20.0881645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.0882004Z fn() 2022-11-23T02:48:20.0882502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.0882910Z test(self, **param_kwargs) 2022-11-23T02:48:20.0883433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.0883820Z return func(*args, **kwargs) 2022-11-23T02:48:20.0884231Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.0884680Z self.run_subtests( 2022-11-23T02:48:20.0885194Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.0885629Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.0886191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.0886616Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.0887164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.0887577Z output = model(*input) 2022-11-23T02:48:20.0888065Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.0888440Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.0888998Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.0889464Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.0890040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.0890423Z _lazy_init(state, module) 2022-11-23T02:48:20.0890937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.0891348Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.0891858Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.0892260Z return func(*args, **kwargs) 2022-11-23T02:48:20.0892812Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.0893199Z p_assert( 2022-11-23T02:48:20.0893663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.0894049Z traceback.print_stack() 2022-11-23T02:48:20.0894342Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.0894704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.0895078Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.0895455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.0895813Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.0896207Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.0896620Z self.run() 2022-11-23T02:48:20.0896946Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.0897320Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.0897849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.0898245Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.0898762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.0899157Z getattr(self, test_name)() 2022-11-23T02:48:20.0899680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.0900031Z fn() 2022-11-23T02:48:20.0900526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.0900935Z test(self, **param_kwargs) 2022-11-23T02:48:20.0901460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.0901838Z return func(*args, **kwargs) 2022-11-23T02:48:20.0902248Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.0902681Z self.run_subtests( 2022-11-23T02:48:20.0903189Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.0903619Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.0904175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.0904609Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.0905158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.0905566Z output = model(*input) 2022-11-23T02:48:20.0906049Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.0906422Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.0906980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.0907439Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.0908011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.0908393Z _lazy_init(state, module) 2022-11-23T02:48:20.0908907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.0909319Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.0909828Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.0910211Z return func(*args, **kwargs) 2022-11-23T02:48:20.0910757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.0911146Z p_assert( 2022-11-23T02:48:20.0911608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.0911996Z traceback.print_stack() 2022-11-23T02:48:20.0912393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0912868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0913252Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.0913626Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.0913981Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.0914463Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.0914840Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.0915410Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.0915737Z self.run() 2022-11-23T02:48:20.0916085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.0916454Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.0916969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.0917365Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.0917901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.0918282Z getattr(self, test_name)() 2022-11-23T02:48:20.0918809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.0919188Z fn() 2022-11-23T02:48:20.0919687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.0920071Z test(self, **param_kwargs) 2022-11-23T02:48:20.0920676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.0921089Z return func(*args,
**kwargs) 2022-11-23T02:48:20.0921485Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.0921866Z self.run_subtests( 2022-11-23T02:48:20.0922382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.0922813Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.0923351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.0923782Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.0924344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.0924725Z output = model(*input) 2022-11-23T02:48:20.0925212Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.0925607Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.0926163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.0926609Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.0927182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.0927585Z _lazy_init(state, module) 2022-11-23T02:48:20.0928089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.0928499Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.0929026Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.0929413Z return func(*args, **kwargs) 2022-11-23T02:48:20.0929947Z File
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.0930336Z p_assert( 2022-11-23T02:48:20.0930812Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.0931179Z traceback.print_stack() 2022-11-23T02:48:20.0931469Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.0931848Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.0932291Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.0932667Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.0933045Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.0933437Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.0933761Z self.run() 2022-11-23T02:48:20.0934102Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.0934473Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.0934982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.0935373Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.0935905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.0936286Z getattr(self, test_name)() 2022-11-23T02:48:20.0936803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.0937179Z fn() 2022-11-23T02:48:20.0937684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.0938067Z test(self, **param_kwargs) 2022-11-23T02:48:20.0938639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in
wrapper 2022-11-23T02:48:20.0939048Z return func(*args, **kwargs) 2022-11-23T02:48:20.0939440Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.0939821Z self.run_subtests( 2022-11-23T02:48:20.0940329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.0940742Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.0941303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.0941735Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.0942299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.0942681Z output = model(*input) 2022-11-23T02:48:20.0943172Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.0943563Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.0944104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.0944567Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.0945145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.0945550Z _lazy_init(state, module) 2022-11-23T02:48:20.0946046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.0946457Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.0946977Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.0947349Z return func(*args, **kwargs) 2022-11-23T02:48:20.0947893Z File
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.0948280Z p_assert( 2022-11-23T02:48:20.0948760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.0949129Z traceback.print_stack() 2022-11-23T02:48:20.0949527Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0950020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.0950463Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.0950843Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.0951225Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.0951602Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.0951967Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.0952358Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.0952696Z self.run() 2022-11-23T02:48:20.0953020Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.0953393Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.0953923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.0954300Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.0954839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.0955521Z getattr(self, test_name)() 2022-11-23T02:48:20.0956056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.0956410Z fn() 2022-11-23T02:48:20.0956983Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.0957403Z test(self, **param_kwargs) 2022-11-23T02:48:20.0957915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.0958317Z return func(*args, **kwargs) 2022-11-23T02:48:20.0958728Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.0959089Z self.run_subtests( 2022-11-23T02:48:20.0959605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.0960037Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.0960593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.0961011Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.0961583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.0961992Z output = model(*input) 2022-11-23T02:48:20.0962459Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.0962853Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.0963409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.0963863Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.0964442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.0964846Z _lazy_init(state, module) 2022-11-23T02:48:20.0965340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1044335Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1045121Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1045546Z return func(*args, **kwargs) 2022-11-23T02:48:20.1046115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1046536Z p_assert( 2022-11-23T02:48:20.1047048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1047680Z traceback.print_stack() 2022-11-23T02:48:20.1047985Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1048383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1048768Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1049169Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1049576Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1049995Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1050338Z self.run() 2022-11-23T02:48:20.1050695Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1051097Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1051646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1052068Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1052652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1053060Z getattr(self, test_name)() 2022-11-23T02:48:20.1053622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1054023Z fn() 2022-11-23T02:48:20.1054649Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1055080Z test(self, **param_kwargs) 2022-11-23T02:48:20.1055638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1056066Z return func(*args, **kwargs) 2022-11-23T02:48:20.1056487Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1056891Z self.run_subtests( 2022-11-23T02:48:20.1057437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1057899Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1058474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1058938Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1059541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1059950Z output = model(*input) 2022-11-23T02:48:20.1060471Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1060894Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1061468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1061967Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1062578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1063007Z _lazy_init(state, module) 2022-11-23T02:48:20.1063539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1063980Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1064539Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1064936Z return func(*args, **kwargs) 2022-11-23T02:48:20.1065518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1065935Z p_assert( 2022-11-23T02:48:20.1066447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1066926Z traceback.print_stack() 2022-11-23T02:48:20.1067346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1067869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1068260Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1068658Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1069060Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1069460Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1069846Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1070264Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1070623Z self.run() 2022-11-23T02:48:20.1070964Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1071374Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1071934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1072338Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1072914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1073464Z getattr(self, test_name)() 
2022-11-23T02:48:20.1074038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1074416Z fn() 2022-11-23T02:48:20.1074947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1075579Z test(self, **param_kwargs) 2022-11-23T02:48:20.1076125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1076563Z return func(*args, **kwargs) 2022-11-23T02:48:20.1076997Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1077406Z self.run_subtests( 2022-11-23T02:48:20.1077936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1078407Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1079010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1079452Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1080054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1080483Z output = model(*input) 2022-11-23T02:48:20.1080983Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1081409Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1082003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1082495Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1083094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.1083525Z _lazy_init(state, module) 2022-11-23T02:48:20.1084074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1084514Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1085049Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1085463Z return func(*args, **kwargs) 2022-11-23T02:48:20.1086043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1086543Z p_assert( 2022-11-23T02:48:20.1087054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1087466Z traceback.print_stack() 2022-11-23T02:48:20.1087750Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1088155Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1088556Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1088958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1089347Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1089770Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1090131Z self.run() 2022-11-23T02:48:20.1090475Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1090877Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1091436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1091842Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1092412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1092916Z getattr(self, test_name)() 
2022-11-23T02:48:20.1093492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1093872Z fn() 2022-11-23T02:48:20.1094402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1094829Z test(self, **param_kwargs) 2022-11-23T02:48:20.1095372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1095809Z return func(*args, **kwargs) 2022-11-23T02:48:20.1096241Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1096628Z self.run_subtests( 2022-11-23T02:48:20.1097174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1097638Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1098239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1098685Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1099286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1099717Z output = model(*input) 2022-11-23T02:48:20.1100218Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1100654Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1101252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1101747Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1102347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.1102772Z _lazy_init(state, module) 2022-11-23T02:48:20.1103321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1103739Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1104298Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1104711Z return func(*args, **kwargs) 2022-11-23T02:48:20.1105298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1105769Z p_assert( 2022-11-23T02:48:20.1106287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1106701Z traceback.print_stack() 2022-11-23T02:48:20.1107108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1107627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1108033Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1108427Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1108811Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1109209Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1109610Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1110014Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1110372Z self.run() 2022-11-23T02:48:20.1110733Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1111115Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1111734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1112166Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1112474Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1113030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1113460Z getattr(self, test_name)() 2022-11-23T02:48:20.1114026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1114407Z fn() 2022-11-23T02:48:20.1114770Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1115413Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1115989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1116419Z test(self, **param_kwargs) 2022-11-23T02:48:20.1116811Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1117217Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1117791Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1118222Z return func(*args, **kwargs) 2022-11-23T02:48:20.1118613Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1118957Z self.run() 2022-11-23T02:48:20.1119368Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1119780Z self.run_subtests( 2022-11-23T02:48:20.1120140Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1120539Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1121112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1121580Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1122138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1122557Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1123131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1123578Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1124168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1124730Z getattr(self, test_name)() 2022-11-23T02:48:20.1125312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1125725Z output = model(*input) 2022-11-23T02:48:20.1126281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1126684Z fn() 2022-11-23T02:48:20.1127157Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1127582Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1128163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1128589Z test(self, **param_kwargs) 2022-11-23T02:48:20.1129143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1129648Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1130252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1130663Z return func(*args, **kwargs) 2022-11-23T02:48:20.1131307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1131746Z _lazy_init(state, module) 2022-11-23T02:48:20.1132178Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1132563Z self.run_subtests( 2022-11-23T02:48:20.1133105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1133550Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1134103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1134576Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1135143Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1135561Z return func(*args, **kwargs) 2022-11-23T02:48:20.1136121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T02:48:20.1136582Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1137187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1137583Z p_assert( 2022-11-23T02:48:20.1138132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1138564Z output = model(*input) 2022-11-23T02:48:20.1139089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1139503Z traceback.print_stack() 2022-11-23T02:48:20.1140026Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1140446Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1141024Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1141513Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1142127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1142537Z _lazy_init(state, module) 2022-11-23T02:48:20.1143084Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1143519Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1144153Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1144555Z return func(*args, **kwargs) 2022-11-23T02:48:20.1145139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1145557Z p_assert( 2022-11-23T02:48:20.1146054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T02:48:20.1146466Z traceback.print_stack() 2022-11-23T02:48:20.1146881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1147406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1147792Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1148185Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1148593Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1148975Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1149364Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1149780Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1150123Z self.run() 2022-11-23T02:48:20.1150539Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1150948Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1151504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1151904Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1152462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1152897Z getattr(self, test_name)() 2022-11-23T02:48:20.1153442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1153833Z fn() 2022-11-23T02:48:20.1154355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1154769Z test(self, **param_kwargs) 2022-11-23T02:48:20.1155599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1156019Z 
return func(*args, **kwargs) 2022-11-23T02:48:20.1156444Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1156830Z self.run_subtests( 2022-11-23T02:48:20.1157365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1157823Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1158412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1158868Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1159467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1159885Z output = model(*input) 2022-11-23T02:48:20.1160380Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1160795Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1161378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1161851Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1162455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1162981Z _lazy_init(state, module) 2022-11-23T02:48:20.1163528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1163939Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1164484Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1164893Z return func(*args, **kwargs) 2022-11-23T02:48:20.1165455Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1165859Z p_assert( 2022-11-23T02:48:20.1166364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1166765Z traceback.print_stack() 2022-11-23T02:48:20.1167048Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1167431Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1167824Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1168207Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1168603Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1169001Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1169337Z self.run() 2022-11-23T02:48:20.1169765Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1170170Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1170710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1171120Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1171679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1172104Z getattr(self, test_name)() 2022-11-23T02:48:20.1172652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1173033Z fn() 2022-11-23T02:48:20.1173607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1174007Z test(self, **param_kwargs) 2022-11-23T02:48:20.1174560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.1174984Z return func(*args, **kwargs) 2022-11-23T02:48:20.1175414Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1175797Z self.run_subtests( 2022-11-23T02:48:20.1176333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1176785Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1177366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1177815Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1178412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1178840Z output = model(*input) 2022-11-23T02:48:20.1179337Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1179751Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1180338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1180819Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1181431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1181954Z _lazy_init(state, module) 2022-11-23T02:48:20.1182506Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1182926Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1183295Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1183434Z return func(*args, **kwargs) 2022-11-23T02:48:20.1183845Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1183958Z p_assert( 2022-11-23T02:48:20.1184332Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1184464Z traceback.print_stack() 2022-11-23T02:48:20.1184705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1184961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1185092Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1185313Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1185463Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1185734Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1185911Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1186125Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1186226Z self.run() 2022-11-23T02:48:20.1186445Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1186594Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1186973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1187121Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1187522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1187656Z getattr(self, test_name)() 2022-11-23T02:48:20.1188035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1188137Z fn() 2022-11-23T02:48:20.1188539Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1188673Z test(self, **param_kwargs) 2022-11-23T02:48:20.1189060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1189193Z return func(*args, **kwargs) 2022-11-23T02:48:20.1189454Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1189576Z self.run_subtests( 2022-11-23T02:48:20.1189947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1190121Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1190518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1190687Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1191103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1191234Z output = model(*input) 2022-11-23T02:48:20.1191586Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1191736Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1192129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1192391Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1192799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1192922Z _lazy_init(state, module) 2022-11-23T02:48:20.1193313Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1193467Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1193837Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1193972Z return func(*args, **kwargs) 2022-11-23T02:48:20.1194370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1194478Z p_assert( 2022-11-23T02:48:20.1194843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1194984Z traceback.print_stack() 2022-11-23T02:48:20.1195368Z File "", line 1, in 2022-11-23T02:48:20.1195603Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1195752Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1196054Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1196213Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1196441Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1196550Z self.run() 2022-11-23T02:48:20.1196766Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1196921Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1197309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1197452Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1197838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1197975Z getattr(self, test_name)() 2022-11-23T02:48:20.1198361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1198471Z fn() 2022-11-23T02:48:20.1198870Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1198997Z test(self, **param_kwargs) 2022-11-23T02:48:20.1199385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1199512Z return func(*args, **kwargs) 2022-11-23T02:48:20.1199767Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1199891Z self.run_subtests( 2022-11-23T02:48:20.1200281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1200451Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1200854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1201020Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1201432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1201554Z output = model(*input) 2022-11-23T02:48:20.1201897Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1202049Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1202456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1202731Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1203136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1203264Z _lazy_init(state, module) 2022-11-23T02:48:20.1203653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1203802Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1204155Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1204292Z return func(*args, **kwargs) 2022-11-23T02:48:20.1204698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1204800Z p_assert( 2022-11-23T02:48:20.1205168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1205308Z traceback.print_stack() 2022-11-23T02:48:20.1205560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1205803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1205924Z File "", line 1, in 2022-11-23T02:48:20.1206209Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1206366Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1206587Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1206740Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1206971Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1207081Z self.run() 2022-11-23T02:48:20.1207293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1207438Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1207819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1207958Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1208356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1208489Z getattr(self, test_name)() 
2022-11-23T02:48:20.1208876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1208982Z fn() 2022-11-23T02:48:20.1209369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1209497Z test(self, **param_kwargs) 2022-11-23T02:48:20.1209891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1210025Z return func(*args, **kwargs) 2022-11-23T02:48:20.1210294Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1210409Z self.run_subtests( 2022-11-23T02:48:20.1210794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1210973Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1211358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1211514Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1211925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1212053Z output = model(*input) 2022-11-23T02:48:20.1212410Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1212624Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1213040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1213228Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1213634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.1213749Z _lazy_init(state, module) 2022-11-23T02:48:20.1214135Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1214280Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1214644Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1214775Z return func(*args, **kwargs) 2022-11-23T02:48:20.1215198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1215306Z p_assert( 2022-11-23T02:48:20.1215663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1215792Z traceback.print_stack() 2022-11-23T02:48:20.1215929Z File "", line 1, in 2022-11-23T02:48:20.1216202Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1216365Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1216580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1216742Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1216971Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1217063Z self.run() 2022-11-23T02:48:20.1217278Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1217435Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1217807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1217946Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1218343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1218476Z getattr(self, test_name)() 
2022-11-23T02:48:20.1218871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1218958Z fn() 2022-11-23T02:48:20.1219354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1219486Z test(self, **param_kwargs) 2022-11-23T02:48:20.1219874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1220012Z return func(*args, **kwargs) 2022-11-23T02:48:20.1220280Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1220401Z self.run_subtests( 2022-11-23T02:48:20.1220776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1220950Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1221352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1221516Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1221924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1222050Z output = model(*input) 2022-11-23T02:48:20.1222407Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1222640Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1223054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1223225Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1223635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.1223764Z _lazy_init(state, module) 2022-11-23T02:48:20.1224150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1224296Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1224665Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1224800Z return func(*args, **kwargs) 2022-11-23T02:48:20.1225218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1225309Z p_assert( 2022-11-23T02:48:20.1225680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1225814Z traceback.print_stack() 2022-11-23T02:48:20.1226114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1226374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1226511Z File "", line 1, in 2022-11-23T02:48:20.1226740Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1226872Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1227006Z File "", line 1, in 2022-11-23T02:48:20.1227219Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1227376Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1227595Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1227703Z self.run() 2022-11-23T02:48:20.1227916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1228063Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1228257Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1228401Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1228604Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1228752Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1229112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1229245Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1229464Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1229554Z self.run() 2022-11-23T02:48:20.1229932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1230057Z getattr(self, test_name)() 2022-11-23T02:48:20.1230260Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1230411Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1230776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, 
in wrapper 2022-11-23T02:48:20.1230878Z fn() 2022-11-23T02:48:20.1231218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1231339Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1231713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1231901Z test(self, **param_kwargs) 2022-11-23T02:48:20.1232272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1232396Z getattr(self, test_name)() 2022-11-23T02:48:20.1232763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1232897Z return func(*args, **kwargs) 2022-11-23T02:48:20.1233264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1233348Z fn() 2022-11-23T02:48:20.1233596Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1233714Z self.run_subtests( 2022-11-23T02:48:20.1234084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1234210Z test(self, **param_kwargs) 2022-11-23T02:48:20.1234571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1234741Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1235413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1235549Z return func(*args, **kwargs) 2022-11-23T02:48:20.1235931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 
2022-11-23T02:48:20.1236083Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1236333Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1236451Z self.run_subtests( 2022-11-23T02:48:20.1236833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1236965Z output = model(*input) 2022-11-23T02:48:20.1237325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1237473Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1237811Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1237954Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1238328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1238483Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1238860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1239040Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1239428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1239535Z output = model(*input) 2022-11-23T02:48:20.1239914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1240037Z _lazy_init(state, module) 2022-11-23T02:48:20.1240375Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1240520Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1240875Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1241020Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1241404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1241650Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1242001Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1242127Z return func(*args, **kwargs) 2022-11-23T02:48:20.1242502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1242632Z _lazy_init(state, module) 2022-11-23T02:48:20.1243021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1243127Z p_assert( 2022-11-23T02:48:20.1243486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1243616Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1243960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1244094Z traceback.print_stack() 2022-11-23T02:48:20.1244443Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1244570Z return func(*args, **kwargs) 2022-11-23T02:48:20.1244954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1245114Z p_assert( 2022-11-23T02:48:20.1245472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1245586Z traceback.print_stack() 2022-11-23T02:48:20.1245830Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1246070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1246202Z File "", line 1, in 2022-11-23T02:48:20.1246419Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1246568Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1246772Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1246925Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1247125Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1247237Z self.run() 2022-11-23T02:48:20.1247442Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1247589Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1247937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1248069Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1248439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1248567Z getattr(self, test_name)() 2022-11-23T02:48:20.1248920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1249020Z fn() 2022-11-23T02:48:20.1249388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1249509Z test(self, **param_kwargs) 2022-11-23T02:48:20.1249867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1249998Z return func(*args, **kwargs) 2022-11-23T02:48:20.1250248Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1250348Z self.run_subtests( 2022-11-23T02:48:20.1250710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1250875Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1251319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1251478Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1251857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1251985Z output = model(*input) 2022-11-23T02:48:20.1252323Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1252450Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1252834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1253013Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1253389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1253519Z _lazy_init(state, module) 2022-11-23T02:48:20.1253881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1254028Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1254425Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1254566Z return func(*args, **kwargs) 2022-11-23T02:48:20.1254943Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.1255047Z p_assert( 2022-11-23T02:48:20.1255386Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1255516Z traceback.print_stack() 2022-11-23T02:48:20.1255646Z File "", line 1, in 2022-11-23T02:48:20.1255862Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1256016Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1256206Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1256362Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1256579Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1256691Z self.run() 2022-11-23T02:48:20.1256898Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1257047Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1257393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1257532Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1257884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1258013Z getattr(self, test_name)() 2022-11-23T02:48:20.1258379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1258477Z fn() 2022-11-23T02:48:20.1258847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1258974Z test(self, **param_kwargs) 2022-11-23T02:48:20.1259343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1259471Z return func(*args, **kwargs) 2022-11-23T02:48:20.1259708Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1259825Z self.run_subtests( 2022-11-23T02:48:20.1260185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1260349Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1260790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1260946Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1261328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1261457Z output = model(*input) 2022-11-23T02:48:20.1261773Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1261918Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1262304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1262486Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1262859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1262989Z _lazy_init(state, module) 2022-11-23T02:48:20.1263345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1263491Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1263865Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1264005Z return func(*args, **kwargs) 2022-11-23T02:48:20.1264396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.1264500Z p_assert( 2022-11-23T02:48:20.1264841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1264969Z traceback.print_stack() 2022-11-23T02:48:20.1265211Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1265464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1265581Z File "", line 1, in 2022-11-23T02:48:20.1265714Z File "", line 1, in 2022-11-23T02:48:20.1265927Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1266077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1266280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1266438Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1266649Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1266778Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1266992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1267100Z self.run() 2022-11-23T02:48:20.1267302Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1267462Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1267667Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1267815Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1268026Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1268119Z self.run() 2022-11-23T02:48:20.1268472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1268609Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1268814Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1268963Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1269327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1269519Z getattr(self, test_name)() 2022-11-23T02:48:20.1269851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1269986Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1270354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1270457Z fn() 2022-11-23T02:48:20.1270828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1270956Z getattr(self, test_name)() 2022-11-23T02:48:20.1271332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1271457Z test(self, **param_kwargs) 2022-11-23T02:48:20.1271805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1271910Z fn() 2022-11-23T02:48:20.1272277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1272404Z return func(*args, **kwargs) 2022-11-23T02:48:20.1272846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1272966Z test(self, **param_kwargs) 2022-11-23T02:48:20.1273276Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1273436Z self.run_subtests( 2022-11-23T02:48:20.1273789Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1273917Z return func(*args, **kwargs) 2022-11-23T02:48:20.1274276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1274442Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1274703Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1274821Z self.run_subtests( 2022-11-23T02:48:20.1275409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1275577Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1275925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1276091Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1276475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1291617Z output = model(*input) 2022-11-23T02:48:20.1292017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1292181Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1292515Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1292655Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1293040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1293147Z output = model(*input) 2022-11-23T02:48:20.1293531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T02:48:20.1293710Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1294041Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1294185Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1294558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1294826Z _lazy_init(state, module) 2022-11-23T02:48:20.1295213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1295375Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1295735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1295883Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1296257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1296380Z _lazy_init(state, module) 2022-11-23T02:48:20.1296727Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1296855Z return func(*args, **kwargs) 2022-11-23T02:48:20.1297220Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1297348Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1297735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1297840Z p_assert( 2022-11-23T02:48:20.1298257Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1298396Z return func(*args, **kwargs) 2022-11-23T02:48:20.1298743Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1298874Z traceback.print_stack() 2022-11-23T02:48:20.1299256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1299342Z p_assert( 2022-11-23T02:48:20.1299682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1299817Z traceback.print_stack() 2022-11-23T02:48:20.1300057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1300287Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1300424Z File "", line 1, in 2022-11-23T02:48:20.1300638Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1300767Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1300972Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1301121Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1301335Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1301440Z self.run() 2022-11-23T02:48:20.1301641Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1301795Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1302144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1302263Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1302635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1302760Z getattr(self, test_name)() 2022-11-23T02:48:20.1303126Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1303228Z fn() 2022-11-23T02:48:20.1303601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1303726Z test(self, **param_kwargs) 2022-11-23T02:48:20.1304086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1304262Z return func(*args, **kwargs) 2022-11-23T02:48:20.1304520Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1304635Z self.run_subtests( 2022-11-23T02:48:20.1305005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1305168Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1305536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1305693Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1306073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1306180Z output = model(*input) 2022-11-23T02:48:20.1306517Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1306662Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1307041Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1307218Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1307646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1307778Z _lazy_init(state, module) 2022-11-23T02:48:20.1308143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1308272Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1308616Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1308742Z return func(*args, **kwargs) 2022-11-23T02:48:20.1309136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1309244Z p_assert( 2022-11-23T02:48:20.1309584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1309713Z traceback.print_stack() 2022-11-23T02:48:20.1309849Z File "", line 1, in 2022-11-23T02:48:20.1310047Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1310191Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1310396Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1310550Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1310767Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1310874Z self.run() 2022-11-23T02:48:20.1311084Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1311217Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1311570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1311706Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1312078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1312200Z getattr(self, test_name)() 2022-11-23T02:48:20.1312570Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1312670Z fn() 2022-11-23T02:48:20.1313041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1313150Z test(self, **param_kwargs) 2022-11-23T02:48:20.1313516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1313708Z return func(*args, **kwargs) 2022-11-23T02:48:20.1313959Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1314075Z self.run_subtests( 2022-11-23T02:48:20.1314442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1314613Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1314985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1315328Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1315726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1315849Z output = model(*input) 2022-11-23T02:48:20.1316188Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1316331Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1316714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1316971Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1317367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1317476Z _lazy_init(state, module) 2022-11-23T02:48:20.1317837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1317984Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1318331Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1318463Z return func(*args, **kwargs) 2022-11-23T02:48:20.1318849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1318955Z p_assert( 2022-11-23T02:48:20.1319300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1319411Z traceback.print_stack() 2022-11-23T02:48:20.1319657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1319895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1320126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1320362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1320598Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1320835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1321063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1321271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1321509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1321745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1321980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1322206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1323229Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1323551Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:48:20.1324573Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1324811Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:48:20.1325049Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1325288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1325524Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1325741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1326027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1326269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1326498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1326729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1326958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1327194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1327423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1327637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1327874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1328106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1328332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1328554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1328779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1329009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1329240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1329450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1329674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1329904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1330135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1330358Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1330580Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1330803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1331028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1331323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1331535Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1331760Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1331993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1332222Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1332449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1332677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1332902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1333125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1333231Z dist init r=1, world=2 2022-11-23T02:48:20.1333572Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.1333945Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1334286Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1334610Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1334922Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1335259Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1335584Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1335900Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1336221Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1336535Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1336857Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.1337158Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1337275Z dist init r=0, world=2 2022-11-23T02:48:20.1337588Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1337895Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1338206Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1338581Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1338890Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1339213Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1339527Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1339834Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1340163Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.1340471Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1340889Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1341222Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1341329Z ok (35.262s) 2022-11-23T02:48:20.1341550Z test_delayed_optim_step_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:48:20.1341868Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85544 2022-11-23T02:48:20.1342099Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85545 2022-11-23T02:48:20.1342494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1342678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1343054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1343254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.1343630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1343809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1344187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1344381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:48:20.1344638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.1344887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.1345302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1345693Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1345928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.1346156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.1346397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1346629Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1347730Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1347850Z warnings.warn( 2022-11-23T02:48:20.1348859Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1348979Z warnings.warn( 2022-11-23T02:48:20.1349112Z File "", line 1, in 2022-11-23T02:48:20.1349330Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1349459Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1349717Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1349882Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1350104Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1350211Z self.run() 2022-11-23T02:48:20.1350416Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1350567Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1350904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1351046Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1351174Z File "", line 1, in 2022-11-23T02:48:20.1351549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1351676Z getattr(self, test_name)() 2022-11-23T02:48:20.1352053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1352157Z fn() 2022-11-23T02:48:20.1352371Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1352498Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1352874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1353002Z test(self, **param_kwargs) 2022-11-23T02:48:20.1353205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", 
line 129, in _main 2022-11-23T02:48:20.1353364Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1353732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1353861Z return func(*args, **kwargs) 2022-11-23T02:48:20.1354078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1354171Z self.run() 2022-11-23T02:48:20.1354428Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1354545Z self.run_subtests( 2022-11-23T02:48:20.1354751Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1354901Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1355543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1355712Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1356159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1356299Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1356670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1356830Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1357198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1357326Z getattr(self, test_name)() 2022-11-23T02:48:20.1357705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1357829Z output = model(*input) 2022-11-23T02:48:20.1358178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T02:48:20.1358287Z fn() 2022-11-23T02:48:20.1358621Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1358765Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1359141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1359332Z test(self, **param_kwargs) 2022-11-23T02:48:20.1359733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1359917Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1360286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1360397Z return func(*args, **kwargs) 2022-11-23T02:48:20.1360771Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1360904Z _lazy_init(state, module) 2022-11-23T02:48:20.1361162Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1361280Z self.run_subtests( 2022-11-23T02:48:20.1361640Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1361790Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1362139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1362308Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1362654Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1362783Z return func(*args, **kwargs) 2022-11-23T02:48:20.1363157Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1363319Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1363705Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1363812Z p_assert( 2022-11-23T02:48:20.1364192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1364300Z output = model(*input) 2022-11-23T02:48:20.1364644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1364773Z traceback.print_stack() 2022-11-23T02:48:20.1365108Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1365254Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1365633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1365880Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1366258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1366364Z _lazy_init(state, module) 2022-11-23T02:48:20.1366729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1366877Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1367223Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1367350Z return func(*args, **kwargs) 2022-11-23T02:48:20.1367734Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1367838Z 
p_assert( 2022-11-23T02:48:20.1368162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1368296Z traceback.print_stack() 2022-11-23T02:48:20.1368537Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1368778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1368964Z File "", line 1, in 2022-11-23T02:48:20.1369192Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1369337Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1369543Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1369682Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1369902Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1370010Z self.run() 2022-11-23T02:48:20.1370216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1370372Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1370727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1370864Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1371235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1371348Z getattr(self, test_name)() 2022-11-23T02:48:20.1371714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1371815Z fn() 2022-11-23T02:48:20.1372184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1372310Z test(self, **param_kwargs) 2022-11-23T02:48:20.1372674Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1372809Z return func(*args, **kwargs) 2022-11-23T02:48:20.1373064Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1373163Z self.run_subtests( 2022-11-23T02:48:20.1373575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1373745Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1374117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1374273Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1374653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1374777Z output = model(*input) 2022-11-23T02:48:20.1375108Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1375306Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1375697Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1375879Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1376256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1376382Z _lazy_init(state, module) 2022-11-23T02:48:20.1376739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1376885Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1377228Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.1377340Z return func(*args, **kwargs) 2022-11-23T02:48:20.1377732Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1377838Z p_assert( 2022-11-23T02:48:20.1378180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1378310Z traceback.print_stack() 2022-11-23T02:48:20.1378501Z File "", line 1, in 2022-11-23T02:48:20.1378728Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1378856Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1379064Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1379217Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1379433Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1379540Z self.run() 2022-11-23T02:48:20.1379743Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1379898Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1380255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1380374Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1380748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1380877Z getattr(self, test_name)() 2022-11-23T02:48:20.1381243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1381347Z fn() 2022-11-23T02:48:20.1381719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1381844Z test(self, **param_kwargs) 2022-11-23T02:48:20.1382207Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1382324Z return func(*args, **kwargs) 2022-11-23T02:48:20.1382578Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1382696Z self.run_subtests( 2022-11-23T02:48:20.1383060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1383227Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1383599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1383754Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1384138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1384246Z output = model(*input) 2022-11-23T02:48:20.1384651Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1384798Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1385183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1385361Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1385741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1385865Z _lazy_init(state, module) 2022-11-23T02:48:20.1386227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1386358Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1386704Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.1386831Z return func(*args, **kwargs) 2022-11-23T02:48:20.1387226Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1387334Z p_assert( 2022-11-23T02:48:20.1387674Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1387804Z traceback.print_stack() 2022-11-23T02:48:20.1388097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1388333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1388466Z File "", line 1, in 2022-11-23T02:48:20.1388597Z File "", line 1, in 2022-11-23T02:48:20.1388812Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1388956Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1389161Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1389321Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1389519Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1389661Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1389879Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1389995Z self.run() 2022-11-23T02:48:20.1390201Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1390355Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1390561Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1390711Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1390911Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1391016Z self.run() 2022-11-23T02:48:20.1391376Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1391517Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1391723Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1391871Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1392243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1392373Z getattr(self, test_name)() 2022-11-23T02:48:20.1392702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1392838Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1393204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1393303Z fn() 2022-11-23T02:48:20.1393672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1393867Z getattr(self, test_name)() 2022-11-23T02:48:20.1394245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1394372Z test(self, **param_kwargs) 2022-11-23T02:48:20.1394720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1394821Z fn() 2022-11-23T02:48:20.1395419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1395556Z return func(*args, **kwargs) 2022-11-23T02:48:20.1395937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1396063Z test(self, **param_kwargs) 2022-11-23T02:48:20.1396316Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1396420Z self.run_subtests( 2022-11-23T02:48:20.1396787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1396916Z return func(*args, **kwargs) 2022-11-23T02:48:20.1397353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1397533Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1397789Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1397906Z self.run_subtests( 2022-11-23T02:48:20.1398283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1398422Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1398778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1398948Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1399334Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1399459Z output = model(*input) 2022-11-23T02:48:20.1399832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1399988Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1400320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1400447Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1400828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:48:20.1400949Z output = model(*input) 2022-11-23T02:48:20.1401340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1401523Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1401858Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1402010Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1402386Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1402512Z _lazy_init(state, module) 2022-11-23T02:48:20.1402876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1403055Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1403412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1403642Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1404020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1404147Z _lazy_init(state, module) 2022-11-23T02:48:20.1404492Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1404619Z return func(*args, **kwargs) 2022-11-23T02:48:20.1404957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1405102Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1405489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1405595Z p_assert( 2022-11-23T02:48:20.1405938Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1406072Z return func(*args, **kwargs) 2022-11-23T02:48:20.1406419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1406548Z traceback.print_stack() 2022-11-23T02:48:20.1406961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1407083Z p_assert( 2022-11-23T02:48:20.1407422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1407551Z traceback.print_stack() 2022-11-23T02:48:20.1407794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1408033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1408168Z File "", line 1, in 2022-11-23T02:48:20.1408365Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1408519Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1408725Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1408880Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1409096Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1409208Z self.run() 2022-11-23T02:48:20.1409415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1409566Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1409902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1410040Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1410414Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1410549Z getattr(self, test_name)() 2022-11-23T02:48:20.1410917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1411021Z fn() 2022-11-23T02:48:20.1411390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1411519Z test(self, **param_kwargs) 2022-11-23T02:48:20.1411863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1411992Z return func(*args, **kwargs) 2022-11-23T02:48:20.1412247Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1412364Z self.run_subtests( 2022-11-23T02:48:20.1412724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1412888Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1413327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1413485Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1413850Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1413977Z output = model(*input) 2022-11-23T02:48:20.1414309Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1414454Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1414838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1415017Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1415393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1415522Z _lazy_init(state, module) 2022-11-23T02:48:20.1415863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1416008Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1416403Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1416542Z return func(*args, **kwargs) 2022-11-23T02:48:20.1416932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1417040Z p_assert( 2022-11-23T02:48:20.1417382Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1417513Z traceback.print_stack() 2022-11-23T02:48:20.1417629Z File "", line 1, in 2022-11-23T02:48:20.1417845Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1417996Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1418204Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1418356Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1418577Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1418684Z self.run() 2022-11-23T02:48:20.1418870Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1419019Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1419368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1419505Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1419872Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1420003Z getattr(self, test_name)() 2022-11-23T02:48:20.1420369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1420472Z fn() 2022-11-23T02:48:20.1420826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1420958Z test(self, **param_kwargs) 2022-11-23T02:48:20.1421323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1421452Z return func(*args, **kwargs) 2022-11-23T02:48:20.1421702Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1421819Z self.run_subtests( 2022-11-23T02:48:20.1422178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1422422Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1422782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1422940Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1423323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1423448Z output = model(*input) 2022-11-23T02:48:20.1423779Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1423924Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1424312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1424493Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1424851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1424981Z _lazy_init(state, module) 2022-11-23T02:48:20.1425344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1425489Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1425884Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1426020Z return func(*args, **kwargs) 2022-11-23T02:48:20.1426409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1426517Z p_assert( 2022-11-23T02:48:20.1426841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1426974Z traceback.print_stack() 2022-11-23T02:48:20.1427212Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1427458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1427589Z File "", line 1, in 2022-11-23T02:48:20.1427719Z File "", line 1, in 2022-11-23T02:48:20.1427932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1428080Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1428270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1428423Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1428634Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1428778Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1428992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1429099Z self.run() 2022-11-23T02:48:20.1429308Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1429443Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1429648Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1429796Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1430015Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1430122Z self.run() 2022-11-23T02:48:20.1430477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1430613Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1430819Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1430950Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1431321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1431518Z getattr(self, test_name)() 2022-11-23T02:48:20.1431868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in 
_run 2022-11-23T02:48:20.1432005Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1432371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1432477Z fn() 2022-11-23T02:48:20.1432849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1432958Z getattr(self, test_name)() 2022-11-23T02:48:20.1433332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1433457Z test(self, **param_kwargs) 2022-11-23T02:48:20.1433822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1433926Z fn() 2022-11-23T02:48:20.1434290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1434420Z return func(*args, **kwargs) 2022-11-23T02:48:20.1434773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1434952Z test(self, **param_kwargs) 2022-11-23T02:48:20.1435442Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1435565Z self.run_subtests( 2022-11-23T02:48:20.1435943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1436073Z return func(*args, **kwargs) 2022-11-23T02:48:20.1436433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1436606Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1436844Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 
2022-11-23T02:48:20.1436962Z self.run_subtests( 2022-11-23T02:48:20.1437332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1437492Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1437854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1438019Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1438403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1438527Z output = model(*input) 2022-11-23T02:48:20.1438896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1439040Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1439374Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1439519Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1439907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1440031Z output = model(*input) 2022-11-23T02:48:20.1440419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1440599Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1440934Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1441062Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1441437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1441656Z _lazy_init(state, module) 2022-11-23T02:48:20.1442042Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1442220Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1442583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1442734Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1443110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1443219Z _lazy_init(state, module) 2022-11-23T02:48:20.1443565Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1443693Z return func(*args, **kwargs) 2022-11-23T02:48:20.1444054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1444200Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1444590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1444698Z p_assert( 2022-11-23T02:48:20.1445111Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1445235Z return func(*args, **kwargs) 2022-11-23T02:48:20.1445582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1445715Z traceback.print_stack() 2022-11-23T02:48:20.1446093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1446198Z p_assert( 2022-11-23T02:48:20.1446542Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1446674Z traceback.print_stack() 2022-11-23T02:48:20.1446902Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1447146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1447284Z File "", line 1, in 2022-11-23T02:48:20.1447500Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1447645Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1447850Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1448005Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1448223Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1448313Z self.run() 2022-11-23T02:48:20.1448522Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1448673Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1449025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1449162Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1449533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1449664Z getattr(self, test_name)() 2022-11-23T02:48:20.1450031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1450117Z fn() 2022-11-23T02:48:20.1450492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1450621Z test(self, **param_kwargs) 2022-11-23T02:48:20.1450982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1451173Z return func(*args, **kwargs) 2022-11-23T02:48:20.1451430Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1451546Z self.run_subtests( 2022-11-23T02:48:20.1451915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1452066Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1452441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1452600Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1452984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1453106Z output = model(*input) 2022-11-23T02:48:20.1453445Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1453592Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1453978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1454194Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1454587Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1454711Z _lazy_init(state, module) 2022-11-23T02:48:20.1455067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1455213Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1455556Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1455689Z return func(*args, **kwargs) 2022-11-23T02:48:20.1456073Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.1456162Z p_assert( 2022-11-23T02:48:20.1456503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1456631Z traceback.print_stack() 2022-11-23T02:48:20.1456767Z File "", line 1, in 2022-11-23T02:48:20.1456984Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1457133Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1457339Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1457475Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1457693Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1457798Z self.run() 2022-11-23T02:48:20.1458007Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1458158Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1458513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1458653Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1459026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1459136Z getattr(self, test_name)() 2022-11-23T02:48:20.1459498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1459600Z fn() 2022-11-23T02:48:20.1459974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1460101Z test(self, **param_kwargs) 2022-11-23T02:48:20.1460465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1460661Z return func(*args, **kwargs) 2022-11-23T02:48:20.1460918Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1461017Z self.run_subtests( 2022-11-23T02:48:20.1461380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1461546Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1461918Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1462073Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1462456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1462578Z output = model(*input) 2022-11-23T02:48:20.1462918Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1463046Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1463430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1463666Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1464055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1464179Z _lazy_init(state, module) 2022-11-23T02:48:20.1464538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1464683Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1465028Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1465147Z return func(*args, **kwargs) 2022-11-23T02:48:20.1465534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.1465642Z p_assert( 2022-11-23T02:48:20.1465987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1466123Z traceback.print_stack() 2022-11-23T02:48:20.1466366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1466607Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1466738Z File "", line 1, in 2022-11-23T02:48:20.1466851Z File "", line 1, in 2022-11-23T02:48:20.1467065Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1467211Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1467421Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1467576Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1467789Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1467933Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1468136Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1468244Z self.run() 2022-11-23T02:48:20.1468446Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1468599Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1468805Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1468954Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1469169Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1469276Z self.run() 2022-11-23T02:48:20.1469685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1469824Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1470030Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1470178Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1470551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1470678Z getattr(self, test_name)() 2022-11-23T02:48:20.1471018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1471152Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1471499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1471600Z fn() 2022-11-23T02:48:20.1471966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1472095Z getattr(self, test_name)() 2022-11-23T02:48:20.1472469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1472595Z test(self, **param_kwargs) 2022-11-23T02:48:20.1473014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1473123Z fn() 2022-11-23T02:48:20.1473520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1473651Z return func(*args, **kwargs) 2022-11-23T02:48:20.1474028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1474154Z test(self, **param_kwargs) 2022-11-23T02:48:20.1474407Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1474528Z self.run_subtests( 2022-11-23T02:48:20.1474894Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1475004Z return func(*args, **kwargs) 2022-11-23T02:48:20.1475595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1475765Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1476018Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1476136Z self.run_subtests( 2022-11-23T02:48:20.1476506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1476661Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1477023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1477187Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1477551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1477673Z output = model(*input) 2022-11-23T02:48:20.1478045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1478202Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1478537Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1478683Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1479059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1479280Z output = model(*input) 2022-11-23T02:48:20.1479654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T02:48:20.1479834Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1480166Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1480315Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1480691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1480819Z _lazy_init(state, module) 2022-11-23T02:48:20.1481200Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1481378Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1481716Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1481867Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1482235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1482360Z _lazy_init(state, module) 2022-11-23T02:48:20.1482768Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1482911Z return func(*args, **kwargs) 2022-11-23T02:48:20.1483276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1483421Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1483787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1483891Z p_assert( 2022-11-23T02:48:20.1484232Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1484364Z return func(*args, **kwargs) 2022-11-23T02:48:20.1484708Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1484840Z traceback.print_stack() 2022-11-23T02:48:20.1485224Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1485333Z p_assert( 2022-11-23T02:48:20.1485653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1485782Z traceback.print_stack() 2022-11-23T02:48:20.1486025Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1486263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1486397Z File "", line 1, in 2022-11-23T02:48:20.1486616Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1486762Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1486951Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1487107Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1487325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1487433Z self.run() 2022-11-23T02:48:20.1487636Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1487786Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1488140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1488281Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1488636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1488907Z getattr(self, test_name)() 2022-11-23T02:48:20.1489283Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1489385Z fn() 2022-11-23T02:48:20.1489757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1489888Z test(self, **param_kwargs) 2022-11-23T02:48:20.1490251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1490380Z return func(*args, **kwargs) 2022-11-23T02:48:20.1490615Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1490731Z self.run_subtests( 2022-11-23T02:48:20.1491090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1491260Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1491633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1491791Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1492240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1492376Z output = model(*input) 2022-11-23T02:48:20.1492693Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1492839Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1493221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1493400Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1493774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1493905Z _lazy_init(state, module) 2022-11-23T02:48:20.1494267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1494413Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1494745Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1494874Z return func(*args, **kwargs) 2022-11-23T02:48:20.1495260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1495370Z p_assert( 2022-11-23T02:48:20.1495714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1495844Z traceback.print_stack() 2022-11-23T02:48:20.1495977Z File "", line 1, in 2022-11-23T02:48:20.1496198Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1496327Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1496533Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1496689Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1496910Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1497019Z self.run() 2022-11-23T02:48:20.1497224Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1497375Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1497709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1497846Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1498216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1498412Z getattr(self, test_name)() 2022-11-23T02:48:20.1498783Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1498885Z fn() 2022-11-23T02:48:20.1499254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1499387Z test(self, **param_kwargs) 2022-11-23T02:48:20.1499736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1499869Z return func(*args, **kwargs) 2022-11-23T02:48:20.1500123Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1500245Z self.run_subtests( 2022-11-23T02:48:20.1500608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1500781Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1501153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1501310Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1501724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1501858Z output = model(*input) 2022-11-23T02:48:20.1502197Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1502341Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1502724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1502903Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1503283Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1503408Z _lazy_init(state, module) 2022-11-23T02:48:20.1503749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1503897Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1504246Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1504377Z return func(*args, **kwargs) 2022-11-23T02:48:20.1504765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1504875Z p_assert( 2022-11-23T02:48:20.1505219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1505349Z traceback.print_stack() 2022-11-23T02:48:20.1505574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1505817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1505950Z File "", line 1, in 2022-11-23T02:48:20.1506165Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1506318Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1506528Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1506684Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1506815Z File "", line 1, in 2022-11-23T02:48:20.1507013Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1507118Z self.run() 2022-11-23T02:48:20.1507326Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1507474Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1507759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1507906Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1508260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1508378Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1508588Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1508739Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1509110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1509235Z getattr(self, test_name)() 2022-11-23T02:48:20.1509452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1509559Z self.run() 2022-11-23T02:48:20.1509931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1510024Z fn() 2022-11-23T02:48:20.1510235Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T02:48:20.1510387Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1510812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1510952Z test(self, **param_kwargs) 2022-11-23T02:48:20.1511301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1511437Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1511802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1511913Z return func(*args, **kwargs) 2022-11-23T02:48:20.1512278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1512410Z getattr(self, test_name)() 2022-11-23T02:48:20.1512666Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1512783Z self.run_subtests( 2022-11-23T02:48:20.1513149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1513254Z fn() 2022-11-23T02:48:20.1513597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1513763Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1514137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1514267Z test(self, **param_kwargs) 2022-11-23T02:48:20.1514641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1514804Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1515398Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1515534Z return func(*args, **kwargs) 2022-11-23T02:48:20.1515933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1516041Z output = model(*input) 2022-11-23T02:48:20.1516298Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1516414Z self.run_subtests( 2022-11-23T02:48:20.1516747Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1516892Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1517247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1517505Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1517895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1518059Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1518432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1518589Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1518965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1519095Z _lazy_init(state, module) 2022-11-23T02:48:20.1519477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1519604Z output = model(*input) 2022-11-23T02:48:20.1519971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1520100Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1520432Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1520575Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1520982Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1521126Z return func(*args, **kwargs) 2022-11-23T02:48:20.1521517Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1521697Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1522088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1522187Z p_assert( 2022-11-23T02:48:20.1522564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1522689Z _lazy_init(state, module) 2022-11-23T02:48:20.1523032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1523165Z traceback.print_stack() 2022-11-23T02:48:20.1523527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1523674Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1524024Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1524136Z return func(*args, **kwargs) 2022-11-23T02:48:20.1524523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1524633Z p_assert( 2022-11-23T02:48:20.1524970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T02:48:20.1525099Z traceback.print_stack() 2022-11-23T02:48:20.1525346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1525585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1525720Z File "", line 1, in 2022-11-23T02:48:20.1525920Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1526064Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1526271Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1526425Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1526642Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1526821Z self.run() 2022-11-23T02:48:20.1527030Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1527161Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1527509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1527647Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1528022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1528151Z getattr(self, test_name)() 2022-11-23T02:48:20.1528512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1528613Z fn() 2022-11-23T02:48:20.1528987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1529097Z test(self, **param_kwargs) 2022-11-23T02:48:20.1529470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1529598Z return 
func(*args, **kwargs) 2022-11-23T02:48:20.1529851Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1529969Z self.run_subtests( 2022-11-23T02:48:20.1530382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1530562Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1530940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1531079Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1531458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1531587Z output = model(*input) 2022-11-23T02:48:20.1531922Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1532066Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1532449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1532635Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1533012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1533120Z _lazy_init(state, module) 2022-11-23T02:48:20.1533479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1533624Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1533967Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1534100Z return func(*args, **kwargs) 2022-11-23T02:48:20.1534485Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1534592Z p_assert( 2022-11-23T02:48:20.1534932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1535047Z traceback.print_stack() 2022-11-23T02:48:20.1535179Z File "", line 1, in 2022-11-23T02:48:20.1535392Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1535536Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1535742Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1535897Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1536113Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1536271Z self.run() 2022-11-23T02:48:20.1536485Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1536636Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1536986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1537123Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1537497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1537624Z getattr(self, test_name)() 2022-11-23T02:48:20.1537993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1538078Z fn() 2022-11-23T02:48:20.1538453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1538578Z test(self, **param_kwargs) 2022-11-23T02:48:20.1538945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.1539074Z return func(*args, **kwargs) 2022-11-23T02:48:20.1539331Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1539448Z self.run_subtests( 2022-11-23T02:48:20.1539862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1540022Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1540395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1540553Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1540937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1541067Z output = model(*input) 2022-11-23T02:48:20.1541399Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1541545Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1541931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1542097Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1542472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1542599Z _lazy_init(state, module) 2022-11-23T02:48:20.1542955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1543100Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1543444Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1543576Z return func(*args, **kwargs) 2022-11-23T02:48:20.1543963Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1544052Z p_assert( 2022-11-23T02:48:20.1544398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1544533Z traceback.print_stack() 2022-11-23T02:48:20.1544776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1545017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1545151Z File "", line 1, in 2022-11-23T02:48:20.1545367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1545513Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1545702Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1545922Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1546142Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1546249Z self.run() 2022-11-23T02:48:20.1546453Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1546607Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1546958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1547095Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1547449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1547574Z getattr(self, test_name)() 2022-11-23T02:48:20.1547938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1548044Z fn() 2022-11-23T02:48:20.1548420Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1548545Z test(self, **param_kwargs) 2022-11-23T02:48:20.1548911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1549073Z return func(*args, **kwargs) 2022-11-23T02:48:20.1549341Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1549459Z self.run_subtests( 2022-11-23T02:48:20.1549821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1549986Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1550357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1550519Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1550905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1551030Z output = model(*input) 2022-11-23T02:48:20.1551348Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1551497Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1551886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1552066Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1552438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1552563Z _lazy_init(state, module) 2022-11-23T02:48:20.1552925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1553076Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1553407Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1553534Z return func(*args, **kwargs) 2022-11-23T02:48:20.1553924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1554032Z p_assert( 2022-11-23T02:48:20.1554164Z File "", line 1, in 2022-11-23T02:48:20.1554509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1554640Z traceback.print_stack() 2022-11-23T02:48:20.1554837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1554985Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1555409Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1555673Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1555889Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1555997Z self.run() 2022-11-23T02:48:20.1556203Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1556354Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1556697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1556835Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1557204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1557332Z getattr(self, test_name)() 2022-11-23T02:48:20.1557696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1557805Z fn() 2022-11-23T02:48:20.1558181Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1558306Z test(self, **param_kwargs) 2022-11-23T02:48:20.1558653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1558847Z return func(*args, **kwargs) 2022-11-23T02:48:20.1559118Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1559235Z self.run_subtests( 2022-11-23T02:48:20.1559597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1559762Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1560133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1560298Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1560665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1560787Z output = model(*input) 2022-11-23T02:48:20.1561119Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1561264Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1561650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1561830Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1562203Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1562332Z _lazy_init(state, module) 2022-11-23T02:48:20.1562671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1562823Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1563171Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1563299Z return func(*args, **kwargs) 2022-11-23T02:48:20.1563687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1563794Z p_assert( 2022-11-23T02:48:20.1564140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1564270Z traceback.print_stack() 2022-11-23T02:48:20.1564497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1564729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1564862Z File "", line 1, in 2022-11-23T02:48:20.1565154Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1565300Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1565505Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1565659Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1565861Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1565968Z self.run() 2022-11-23T02:48:20.1566174Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1566326Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1566681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1566817Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1567184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1567315Z getattr(self, test_name)() 
2022-11-23T02:48:20.1567663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1567764Z fn() 2022-11-23T02:48:20.1568189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1568325Z test(self, **param_kwargs) 2022-11-23T02:48:20.1568693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1568824Z return func(*args, **kwargs) 2022-11-23T02:48:20.1569078Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1569194Z self.run_subtests( 2022-11-23T02:48:20.1569536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1569711Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1570084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1570239Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1570626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1570750Z output = model(*input) 2022-11-23T02:48:20.1571084Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1571230Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1571597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1571778Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1572155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.1572280Z _lazy_init(state, module) 2022-11-23T02:48:20.1572695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1572845Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1573196Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1573326Z return func(*args, **kwargs) 2022-11-23T02:48:20.1573733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1573840Z p_assert( 2022-11-23T02:48:20.1574182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1574312Z traceback.print_stack() 2022-11-23T02:48:20.1574508Z File "", line 1, in 2022-11-23T02:48:20.1574722Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1574867Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1575075Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1575214Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1575436Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1575544Z self.run() 2022-11-23T02:48:20.1575749Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1575897Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1576250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1576386Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1576733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1576867Z getattr(self, test_name)() 
2022-11-23T02:48:20.1577233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1577339Z fn() 2022-11-23T02:48:20.1577762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1577900Z test(self, **param_kwargs) 2022-11-23T02:48:20.1578263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1578393Z return func(*args, **kwargs) 2022-11-23T02:48:20.1578632Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:48:20.1578750Z self.run_subtests( 2022-11-23T02:48:20.1579109Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1579282Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1579657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1579814Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1580200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1580325Z output = model(*input) 2022-11-23T02:48:20.1580640Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1580785Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1581166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1581345Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1581726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.1581853Z _lazy_init(state, module) 2022-11-23T02:48:20.1582213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1582359Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1582691Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1582821Z return func(*args, **kwargs) 2022-11-23T02:48:20.1583205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1583314Z p_assert( 2022-11-23T02:48:20.1583658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1583790Z traceback.print_stack() 2022-11-23T02:48:20.1584096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1584335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1584554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1584797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1585029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1585261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1585494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1585723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1585954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1586192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1586419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1586634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1587708Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1587961Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:48:20.1588963Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1589209Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:48:20.1589449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1589686Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1589920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1590152Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1590386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1590615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1590829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1591064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1591293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1591519Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1591745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1591974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1592203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1592498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1592708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1592938Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1593171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1593400Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1593626Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1593848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1594083Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1594314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1594550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1594762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1594988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1595524Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1595768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1595999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1596226Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1596451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1596683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1596920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1597134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1597360Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1597590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1597820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1597940Z dist init r=1, world=2 2022-11-23T02:48:20.1598282Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.1598609Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1598950Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1599277Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1599577Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1599912Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1600236Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1600628Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1600959Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1601284Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1601597Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.1601924Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1602042Z dist init r=0, world=2 2022-11-23T02:48:20.1602362Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1602673Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1603021Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1603344Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1603653Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1603978Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1604300Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1604614Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1604938Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.1605252Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1605565Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1605889Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1605997Z ok (35.167s) 2022-11-23T02:48:20.1606206Z test_delayed_reduce_scatter_offload_false_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:48:20.1606529Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85627 2022-11-23T02:48:20.1606758Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85628 2022-11-23T02:48:20.1607159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1607341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1607732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1607990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.1608367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1608546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1608916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1609109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:48:20.1609360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.1609608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.1610021Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1610436Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1610669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.1610902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.1611191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1611417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1612456Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1612581Z warnings.warn( 2022-11-23T02:48:20.1613594Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1613711Z warnings.warn( 2022-11-23T02:48:20.1613946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1614184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1614421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1614664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1614895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1615123Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1615340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1615575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1615806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1616036Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1616262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1616492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1616797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1617028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1617237Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1617469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1617696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1617925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1618966Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1619180Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.1620261Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1620482Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.1620714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1620953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1621189Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1621403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1621639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1621867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1622097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1622327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1622557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1622787Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1623019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1623251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1623463Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1623696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1623925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1624150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1624379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1624611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1624841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1625130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1625337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1625568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1625799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1626030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1626797Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1627535Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1627780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1628111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1628355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1628585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1628683Z dist init r=0, world=2 2022-11-23T02:48:20.1628796Z dist init r=1, world=2 2022-11-23T02:48:20.1628899Z ok (4.912s) 2022-11-23T02:48:20.1629117Z test_delayed_reduce_scatter_offload_false_none (__main__.TestParityWithDDP) 2022-11-23T02:48:20.1630047Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82704 for platform(s) linux, rocm. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:48:20.1630281Z test_delayed_reduce_scatter_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:48:20.1631173Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82398 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:48:20.1631395Z test_delayed_reduce_scatter_offload_true_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:48:20.1631718Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85710 2022-11-23T02:48:20.1631944Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85711 2022-11-23T02:48:20.1632312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1632494Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1632885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1633085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.1633458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1633637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1634089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1634287Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.1634537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.1634771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.1635506Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1635934Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1636170Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.1636407Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.1636652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1636892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1638008Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1638139Z warnings.warn( 2022-11-23T02:48:20.1639157Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1639280Z warnings.warn( 2022-11-23T02:48:20.1639397Z File "", line 1, in 2022-11-23T02:48:20.1639621Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1639769Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1639978Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1640133Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1640352Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1640461Z self.run() 2022-11-23T02:48:20.1640668Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1640807Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1641163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1641304Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1641676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1641807Z getattr(self, test_name)() 2022-11-23T02:48:20.1642179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1642283Z fn() 2022-11-23T02:48:20.1642642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1642769Z test(self, **param_kwargs) 2022-11-23T02:48:20.1643137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1643347Z return func(*args, **kwargs) 
2022-11-23T02:48:20.1643607Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1643724Z self.run_subtests( 2022-11-23T02:48:20.1644095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1644267Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1644623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1644782Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1645168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1645294Z output = model(*input) 2022-11-23T02:48:20.1645627Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1645779Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1646176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1646358Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1646790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1646907Z _lazy_init(state, module) 2022-11-23T02:48:20.1647271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1647418Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1647762Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1647890Z return func(*args, **kwargs) 2022-11-23T02:48:20.1648281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1648401Z p_assert( 2022-11-23T02:48:20.1648752Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1648866Z traceback.print_stack() 2022-11-23T02:48:20.1648998Z File "", line 1, in 2022-11-23T02:48:20.1649219Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1649367Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1649575Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1649728Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1649945Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1650034Z self.run() 2022-11-23T02:48:20.1650243Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1650400Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1650751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1650889Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1651257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1651390Z getattr(self, test_name)() 2022-11-23T02:48:20.1651765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1651850Z fn() 2022-11-23T02:48:20.1652226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1652354Z test(self, **param_kwargs) 2022-11-23T02:48:20.1652722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1652914Z return func(*args, **kwargs) 
2022-11-23T02:48:20.1653177Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1653296Z self.run_subtests( 2022-11-23T02:48:20.1653664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1653821Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1654193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1654351Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1654740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1654864Z output = model(*input) 2022-11-23T02:48:20.1655195Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1655347Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1655733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1655899Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1656323Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1656457Z _lazy_init(state, module) 2022-11-23T02:48:20.1656820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1656967Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1657312Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1657441Z return func(*args, **kwargs) 2022-11-23T02:48:20.1657829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1657927Z p_assert( 2022-11-23T02:48:20.1658276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1658407Z traceback.print_stack() 2022-11-23T02:48:20.1658653Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1658898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1659034Z File "", line 1, in 2022-11-23T02:48:20.1659253Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1659400Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1659587Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1659742Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1659966Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1660074Z self.run() 2022-11-23T02:48:20.1660280Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1660429Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1660783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1660904Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1661277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1661406Z getattr(self, test_name)() 2022-11-23T02:48:20.1661774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1661877Z fn() 2022-11-23T02:48:20.1662249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1662438Z test(self, 
**param_kwargs) 2022-11-23T02:48:20.1662808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1662921Z return func(*args, **kwargs) 2022-11-23T02:48:20.1663056Z File "", line 1, in 2022-11-23T02:48:20.1663322Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1663445Z self.run_subtests( 2022-11-23T02:48:20.1663807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1663976Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1664192Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1664340Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1664704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1664863Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1665070Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1665226Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1665664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1665797Z output = model(*input) 2022-11-23T02:48:20.1666018Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1666126Z self.run() 2022-11-23T02:48:20.1666445Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1666590Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1666799Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1666958Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:48:20.1667347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1667531Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1667883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1668004Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1668381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1668508Z _lazy_init(state, module) 2022-11-23T02:48:20.1668875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1669005Z getattr(self, test_name)() 2022-11-23T02:48:20.1669372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1669524Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1669894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1669978Z fn() 2022-11-23T02:48:20.1670325Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1670458Z return func(*args, **kwargs) 2022-11-23T02:48:20.1670833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1670959Z test(self, **param_kwargs) 2022-11-23T02:48:20.1671348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1671458Z p_assert( 2022-11-23T02:48:20.1671826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1672006Z return 
func(*args, **kwargs) 2022-11-23T02:48:20.1672353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1672482Z traceback.print_stack() 2022-11-23T02:48:20.1672745Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1672865Z self.run_subtests( 2022-11-23T02:48:20.1673226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1673434Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1673814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1673952Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1674342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1674466Z output = model(*input) 2022-11-23T02:48:20.1674797Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1674941Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1675621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1675816Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1676194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1676303Z _lazy_init(state, module) 2022-11-23T02:48:20.1676663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1676819Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1677168Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1677297Z return func(*args, **kwargs) 2022-11-23T02:48:20.1677679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1677788Z p_assert( 2022-11-23T02:48:20.1678136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1678250Z traceback.print_stack() 2022-11-23T02:48:20.1678492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1678732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1678868Z File "", line 1, in 2022-11-23T02:48:20.1679083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1679235Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1679443Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1679600Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1679801Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1679911Z self.run() 2022-11-23T02:48:20.1680123Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1680274Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1680629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1680766Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1681135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1681267Z getattr(self, test_name)() 2022-11-23T02:48:20.1681720Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1681822Z fn() 2022-11-23T02:48:20.1682195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1682324Z test(self, **param_kwargs) 2022-11-23T02:48:20.1682694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1682826Z return func(*args, **kwargs) 2022-11-23T02:48:20.1683086Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1683205Z self.run_subtests( 2022-11-23T02:48:20.1683550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1683719Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1684097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1684255Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1684635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1684812Z output = model(*input) 2022-11-23T02:48:20.1685157Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1685300Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1685664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1685848Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1686223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1686356Z _lazy_init(state, module) 2022-11-23T02:48:20.1686715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1686864Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1687211Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1687345Z return func(*args, **kwargs) 2022-11-23T02:48:20.1687714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1687819Z p_assert( 2022-11-23T02:48:20.1688165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1688296Z traceback.print_stack() 2022-11-23T02:48:20.1688429Z File "", line 1, in 2022-11-23T02:48:20.1688642Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1688794Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1688984Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1689145Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1689365Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1689476Z self.run() 2022-11-23T02:48:20.1689683Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1689834Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1690186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1690324Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1690679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1690805Z getattr(self, test_name)() 2022-11-23T02:48:20.1691241Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1691345Z fn() 2022-11-23T02:48:20.1691720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1691844Z test(self, **param_kwargs) 2022-11-23T02:48:20.1692210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1692343Z return func(*args, **kwargs) 2022-11-23T02:48:20.1692585Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1692702Z self.run_subtests( 2022-11-23T02:48:20.1693069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1693239Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1693612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1693769Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1694152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1694330Z output = model(*input) 2022-11-23T02:48:20.1694658Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1694802Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1695186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1695367Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1695742Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1695876Z _lazy_init(state, module) 2022-11-23T02:48:20.1696236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1696382Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1696709Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1696843Z return func(*args, **kwargs) 2022-11-23T02:48:20.1697238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1697346Z p_assert( 2022-11-23T02:48:20.1697692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1697825Z traceback.print_stack() 2022-11-23T02:48:20.1698071Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1698320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1698437Z File "", line 1, in 2022-11-23T02:48:20.1698567Z File "", line 1, in 2022-11-23T02:48:20.1698781Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1698927Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1699140Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1699297Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1699511Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1699638Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1699857Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1699964Z self.run() 2022-11-23T02:48:20.1700168Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1700398Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1700606Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1700757Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1700974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1701068Z self.run() 2022-11-23T02:48:20.1701427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1701564Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1701771Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1701922Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1702296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1702428Z getattr(self, test_name)() 2022-11-23T02:48:20.1702776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in 
_run 2022-11-23T02:48:20.1702894Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1703264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1703367Z fn() 2022-11-23T02:48:20.1703789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1703925Z getattr(self, test_name)() 2022-11-23T02:48:20.1704300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1704426Z test(self, **param_kwargs) 2022-11-23T02:48:20.1704778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1704883Z fn() 2022-11-23T02:48:20.1705258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1705390Z return func(*args, **kwargs) 2022-11-23T02:48:20.1705762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1705890Z test(self, **param_kwargs) 2022-11-23T02:48:20.1706152Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1706271Z self.run_subtests( 2022-11-23T02:48:20.1706619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1706748Z return func(*args, **kwargs) 2022-11-23T02:48:20.1707113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1707281Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1707545Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 
2022-11-23T02:48:20.1707662Z self.run_subtests( 2022-11-23T02:48:20.1708036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1708201Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1708546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1708716Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1709100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1709227Z output = model(*input) 2022-11-23T02:48:20.1709599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1709817Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1710151Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1710296Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1710666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1710791Z output = model(*input) 2022-11-23T02:48:20.1711178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1711359Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1711695Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1711843Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1712221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1712354Z _lazy_init(state, module) 2022-11-23T02:48:20.1712740Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1712900Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1713312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1713469Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1713846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1713973Z _lazy_init(state, module) 2022-11-23T02:48:20.1714320Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1714449Z return func(*args, **kwargs) 2022-11-23T02:48:20.1714817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1714951Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1715583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1715688Z p_assert( 2022-11-23T02:48:20.1716039Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1716169Z return func(*args, **kwargs) 2022-11-23T02:48:20.1716516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1716647Z traceback.print_stack() 2022-11-23T02:48:20.1717029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1717119Z p_assert( 2022-11-23T02:48:20.1717461Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1717598Z traceback.print_stack() 2022-11-23T02:48:20.1717842Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1718084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1718223Z File "", line 1, in 2022-11-23T02:48:20.1718442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1718568Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1718774Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1718932Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1719150Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1719260Z self.run() 2022-11-23T02:48:20.1719467Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1719712Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1720065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1720185Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1720560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1720691Z getattr(self, test_name)() 2022-11-23T02:48:20.1721059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1721163Z fn() 2022-11-23T02:48:20.1721534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1721665Z test(self, **param_kwargs) 2022-11-23T02:48:20.1722026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1722142Z return func(*args, **kwargs) 2022-11-23T02:48:20.1722399Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1722520Z self.run_subtests( 2022-11-23T02:48:20.1722947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1723128Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1723497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1723657Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1724044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1724150Z output = model(*input) 2022-11-23T02:48:20.1724489Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1724637Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1725021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1725202Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1725582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1725711Z _lazy_init(state, module) 2022-11-23T02:48:20.1726069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1726197Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1726543Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1726672Z return func(*args, **kwargs) 2022-11-23T02:48:20.1727066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.1727173Z p_assert( 2022-11-23T02:48:20.1727516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1727647Z traceback.print_stack() 2022-11-23T02:48:20.1727785Z File "", line 1, in 2022-11-23T02:48:20.1727986Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1728134Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1728341Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1728495Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1728713Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1728826Z self.run() 2022-11-23T02:48:20.1729101Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1729233Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1729582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1729719Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1730093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1730225Z getattr(self, test_name)() 2022-11-23T02:48:20.1730595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1730698Z fn() 2022-11-23T02:48:20.1731072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1731182Z test(self, **param_kwargs) 2022-11-23T02:48:20.1731549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1731683Z return func(*args, **kwargs) 2022-11-23T02:48:20.1731946Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1732065Z self.run_subtests( 2022-11-23T02:48:20.1732478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1732656Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1733026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1733167Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1733552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1733678Z output = model(*input) 2022-11-23T02:48:20.1734017Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1734164Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1734552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1734737Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1735112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1735222Z _lazy_init(state, module) 2022-11-23T02:48:20.1735581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1735742Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1736087Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1736221Z return func(*args, **kwargs) 2022-11-23T02:48:20.1736609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.1736718Z p_assert( 2022-11-23T02:48:20.1737061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1737175Z traceback.print_stack() 2022-11-23T02:48:20.1737424Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1737659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1737796Z File "", line 1, in 2022-11-23T02:48:20.1738011Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1738155Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1738361Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1738578Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1738778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1738886Z self.run() 2022-11-23T02:48:20.1739091Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1739241Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1739596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1739736Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1740109Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1740219Z getattr(self, test_name)() 2022-11-23T02:48:20.1740590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1740693Z fn() 2022-11-23T02:48:20.1741073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1741202Z test(self, **param_kwargs) 
2022-11-23T02:48:20.1741336Z File "", line 1, in 2022-11-23T02:48:20.1741705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1741890Z return func(*args, **kwargs) 2022-11-23T02:48:20.1742141Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1742257Z self.run_subtests( 2022-11-23T02:48:20.1742474Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1742622Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1742991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1743163Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1743370Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1743527Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1743882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1744043Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1744264Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1744373Z self.run() 2022-11-23T02:48:20.1744762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1744886Z output = model(*input) 2022-11-23T02:48:20.1745094Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1745247Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1745572Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1745717Z return forward_call(*input, **kwargs) 
2022-11-23T02:48:20.1746064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1746204Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1746593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1746775Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1747147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1747276Z getattr(self, test_name)() 2022-11-23T02:48:20.1747634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1747840Z _lazy_init(state, module) 2022-11-23T02:48:20.1748211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1748313Z fn() 2022-11-23T02:48:20.1748671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1748826Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1749203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1749332Z test(self, **param_kwargs) 2022-11-23T02:48:20.1749661Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1749791Z return func(*args, **kwargs) 2022-11-23T02:48:20.1750157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1750295Z return func(*args, **kwargs) 2022-11-23T02:48:20.1750685Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.1750793Z p_assert( 2022-11-23T02:48:20.1751053Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1751173Z self.run_subtests( 2022-11-23T02:48:20.1751552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1751694Z traceback.print_stack() 2022-11-23T02:48:20.1752058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1752226Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1752597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1752762Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1753148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1753275Z output = model(*input) 2022-11-23T02:48:20.1753594Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1753745Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1754133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1754318Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1754696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1754822Z _lazy_init(state, module) 2022-11-23T02:48:20.1755403Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1755564Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1755904Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1756033Z return func(*args, **kwargs) 2022-11-23T02:48:20.1756426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1756537Z p_assert( 2022-11-23T02:48:20.1756883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1757015Z traceback.print_stack() 2022-11-23T02:48:20.1757255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1757498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1757616Z File "", line 1, in 2022-11-23T02:48:20.1757930Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1758079Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1758287Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1758442Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1758666Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1758777Z self.run() 2022-11-23T02:48:20.1758970Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1759121Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1759474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1759612Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1759983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1760217Z getattr(self, test_name)() 2022-11-23T02:48:20.1760577Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1760964Z fn() 2022-11-23T02:48:20.1761390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1761628Z test(self, **param_kwargs) 2022-11-23T02:48:20.1762051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1762231Z return func(*args, **kwargs) 2022-11-23T02:48:20.1762531Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1762631Z self.run_subtests( 2022-11-23T02:48:20.1763032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1763264Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1763734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1763935Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1764362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1764534Z output = model(*input) 2022-11-23T02:48:20.1764909Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1765090Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1765460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1765676Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1766148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1766357Z _lazy_init(state, module) 2022-11-23T02:48:20.1766759Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1766950Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1767340Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1767507Z return func(*args, **kwargs) 2022-11-23T02:48:20.1767879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1768022Z p_assert( 2022-11-23T02:48:20.1768192Z File "", line 1, in 2022-11-23T02:48:20.1768573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1768779Z traceback.print_stack() 2022-11-23T02:48:20.1769110Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1769291Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1769480Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1769674Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1769927Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1770071Z self.run() 2022-11-23T02:48:20.1770361Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1770548Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1770969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1771138Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1771493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1771651Z getattr(self, test_name)() 2022-11-23T02:48:20.1772056Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1772197Z fn() 2022-11-23T02:48:20.1772612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1772840Z test(self, **param_kwargs) 2022-11-23T02:48:20.1773253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1773505Z return func(*args, **kwargs) 2022-11-23T02:48:20.1773760Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1773916Z self.run_subtests( 2022-11-23T02:48:20.1774324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1774537Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1774947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1775181Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1775596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1775745Z output = model(*input) 2022-11-23T02:48:20.1776064Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1776276Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1776692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1776894Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1777300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1777478Z _lazy_init(state, module) 2022-11-23T02:48:20.1777880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1778064Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1778402Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1778572Z return func(*args, **kwargs) 2022-11-23T02:48:20.1779033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1779179Z p_assert( 2022-11-23T02:48:20.1779565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1779740Z traceback.print_stack() 2022-11-23T02:48:20.1780017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1780415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1780534Z File "", line 1, in 2022-11-23T02:48:20.1780699Z File "", line 1, in 2022-11-23T02:48:20.1780951Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1781175Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1781433Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1781629Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1781880Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1782008Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1782261Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1782404Z self.run() 2022-11-23T02:48:20.1782651Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1782843Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1783124Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1783314Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1783616Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1783714Z self.run() 2022-11-23T02:48:20.1784176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1784350Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1784594Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1784779Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1785204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1785413Z getattr(self, test_name)() 2022-11-23T02:48:20.1785805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in 
_run 2022-11-23T02:48:20.1785926Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1786340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1786481Z fn() 2022-11-23T02:48:20.1786889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1787055Z getattr(self, test_name)() 2022-11-23T02:48:20.1787476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1787641Z test(self, **param_kwargs) 2022-11-23T02:48:20.1787997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1788180Z fn() 2022-11-23T02:48:20.1788584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1788754Z return func(*args, **kwargs) 2022-11-23T02:48:20.1789221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1789390Z test(self, **param_kwargs) 2022-11-23T02:48:20.1789698Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1789853Z self.run_subtests( 2022-11-23T02:48:20.1790204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1790371Z return func(*args, **kwargs) 2022-11-23T02:48:20.1790804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1791074Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1791370Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 
2022-11-23T02:48:20.1791538Z self.run_subtests( 2022-11-23T02:48:20.1791956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1792159Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1792502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1792708Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1793124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1793324Z output = model(*input) 2022-11-23T02:48:20.1793735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1793993Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1794368Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1794549Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1794967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1795410Z output = model(*input) 2022-11-23T02:48:20.1795853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1796074Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1796490Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1796685Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1797105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1797269Z _lazy_init(state, module) 2022-11-23T02:48:20.1797692Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1797861Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1798258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1798447Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1798859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1799069Z _lazy_init(state, module) 2022-11-23T02:48:20.1799507Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1799682Z return func(*args, **kwargs) 2022-11-23T02:48:20.1800081Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1800212Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1800635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1800782Z p_assert( 2022-11-23T02:48:20.1801169Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1801347Z return func(*args, **kwargs) 2022-11-23T02:48:20.1801767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1801939Z traceback.print_stack() 2022-11-23T02:48:20.1802311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1802557Z p_assert( 2022-11-23T02:48:20.1802946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1803116Z traceback.print_stack() 2022-11-23T02:48:20.1803405Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1803688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1803861Z File "", line 1, in 2022-11-23T02:48:20.1804205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1804339Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1804581Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1804770Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1805022Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1805185Z self.run() 2022-11-23T02:48:20.1805433Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1805619Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1806011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1806132Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1806654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1806836Z getattr(self, test_name)() 2022-11-23T02:48:20.1807248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1807396Z fn() 2022-11-23T02:48:20.1807806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1807967Z test(self, **param_kwargs) 2022-11-23T02:48:20.1808378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1808495Z return func(*args, **kwargs) 2022-11-23T02:48:20.1808847Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1809042Z self.run_subtests( 2022-11-23T02:48:20.1809449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1809666Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1810073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1810270Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1810691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1810807Z output = model(*input) 2022-11-23T02:48:20.1811180Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1811361Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1811819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1812057Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1812479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1812645Z _lazy_init(state, module) 2022-11-23T02:48:20.1813046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1813179Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1813561Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1813848Z return func(*args, **kwargs) 2022-11-23T02:48:20.1814289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.1814471Z p_assert( 2022-11-23T02:48:20.1814864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1815034Z traceback.print_stack() 2022-11-23T02:48:20.1815207Z File "", line 1, in 2022-11-23T02:48:20.1815406Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1815586Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1815827Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1816026Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1816279Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1816465Z self.run() 2022-11-23T02:48:20.1816711Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1816846Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1817235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1817410Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1817866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1818108Z getattr(self, test_name)() 2022-11-23T02:48:20.1818522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1818662Z fn() 2022-11-23T02:48:20.1819117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1819235Z test(self, **param_kwargs) 2022-11-23T02:48:20.1819648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1819814Z return func(*args, **kwargs) 2022-11-23T02:48:20.1820106Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1820269Z self.run_subtests( 2022-11-23T02:48:20.1820675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1820884Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1821296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1821441Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1821900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1822068Z output = model(*input) 2022-11-23T02:48:20.1822439Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1822630Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1823111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1823331Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1823750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1823859Z _lazy_init(state, module) 2022-11-23T02:48:20.1824255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1824474Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1824867Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1825107Z return func(*args, **kwargs) 2022-11-23T02:48:20.1825534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.1825676Z p_assert( 2022-11-23T02:48:20.1826068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1826185Z traceback.print_stack() 2022-11-23T02:48:20.1826464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1826743Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1826964Z File "", line 1, in 2022-11-23T02:48:20.1827218Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1827404Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1827696Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1827903Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1828111Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1828256Z self.run() 2022-11-23T02:48:20.1828499Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1828746Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1828964Z File "", line 1, in 2022-11-23T02:48:20.1829364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1829538Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1829892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1830057Z getattr(self, test_name)() 2022-11-23T02:48:20.1830304Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1830493Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1830921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T02:48:20.1831062Z fn() 2022-11-23T02:48:20.1831337Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1831535Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1831898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1832119Z test(self, **param_kwargs) 2022-11-23T02:48:20.1832372Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1832513Z self.run() 2022-11-23T02:48:20.1832935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1833106Z return func(*args, **kwargs) 2022-11-23T02:48:20.1833355Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1833491Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1833827Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1833980Z self.run_subtests( 2022-11-23T02:48:20.1834370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1834545Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1834955Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1835497Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1835927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1836197Z getattr(self, test_name)() 2022-11-23T02:48:20.1836563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1836808Z fsdp_loss = self._train_for_several_steps( 
2022-11-23T02:48:20.1837264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1837421Z fn() 2022-11-23T02:48:20.1837844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1838002Z output = model(*input) 2022-11-23T02:48:20.1838412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1838523Z test(self, **param_kwargs) 2022-11-23T02:48:20.1838899Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1839085Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1839529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1839708Z return func(*args, **kwargs) 2022-11-23T02:48:20.1840130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1840415Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1840727Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1840881Z self.run_subtests( 2022-11-23T02:48:20.1841248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1841410Z _lazy_init(state, module) 2022-11-23T02:48:20.1841803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1842127Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1842535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.1842717Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1843128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1843329Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1843661Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1843827Z return func(*args, **kwargs) 2022-11-23T02:48:20.1844246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1844442Z output = model(*input) 2022-11-23T02:48:20.1844908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1845060Z p_assert( 2022-11-23T02:48:20.1845433Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1845613Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1845951Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1846121Z traceback.print_stack() 2022-11-23T02:48:20.1846541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1846767Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1847180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1847428Z _lazy_init(state, module) 2022-11-23T02:48:20.1847900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1848085Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1848416Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1848583Z return func(*args, **kwargs) 2022-11-23T02:48:20.1849020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1849164Z p_assert( 2022-11-23T02:48:20.1849545Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1849710Z traceback.print_stack() 2022-11-23T02:48:20.1850026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1850257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1850434Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1850685Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1850876Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1851121Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1851313Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1851665Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1851822Z self.run() 2022-11-23T02:48:20.1852015Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1852234Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1852631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1852811Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1853222Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1853391Z getattr(self, test_name)() 2022-11-23T02:48:20.1853794Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1853932Z fn() 2022-11-23T02:48:20.1854293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1854459Z test(self, **param_kwargs) 2022-11-23T02:48:20.1854897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1855076Z return func(*args, **kwargs) 2022-11-23T02:48:20.1855375Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1855526Z self.run_subtests( 2022-11-23T02:48:20.1855928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1856140Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1856498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1856744Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1857170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1857378Z output = model(*input) 2022-11-23T02:48:20.1857754Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1857937Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1858363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1858583Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1859010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1859179Z _lazy_init(state, module) 2022-11-23T02:48:20.1859588Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1859777Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1860201Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1860370Z return func(*args, **kwargs) 2022-11-23T02:48:20.1860793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1860937Z p_assert( 2022-11-23T02:48:20.1861267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1861432Z traceback.print_stack() 2022-11-23T02:48:20.1861666Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1861916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1862100Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1862380Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1862578Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1862841Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1862996Z self.run() 2022-11-23T02:48:20.1863240Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1863435Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1863825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1864000Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1864408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1864615Z getattr(self, test_name)() 2022-11-23T02:48:20.1864973Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1865113Z fn() 2022-11-23T02:48:20.1865527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1887516Z test(self, **param_kwargs) 2022-11-23T02:48:20.1887978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1888110Z return func(*args, **kwargs) 2022-11-23T02:48:20.1888371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1888487Z self.run_subtests( 2022-11-23T02:48:20.1888855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1889033Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1889395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1889553Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1889942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1890067Z output = model(*input) 2022-11-23T02:48:20.1890408Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1890552Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1890936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1891116Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1891681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.1891810Z _lazy_init(state, module) 2022-11-23T02:48:20.1892168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1892321Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1892670Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1892798Z return func(*args, **kwargs) 2022-11-23T02:48:20.1893185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1893291Z p_assert( 2022-11-23T02:48:20.1893621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.1893754Z traceback.print_stack() 2022-11-23T02:48:20.1894001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1894240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1894372Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1894665Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1894829Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1895036Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1895170Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1895388Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1895493Z self.run() 2022-11-23T02:48:20.1895698Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1895847Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1896208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1896344Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1896715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1896826Z getattr(self, test_name)() 2022-11-23T02:48:20.1897196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1897298Z fn() 2022-11-23T02:48:20.1897668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1897793Z test(self, **param_kwargs) 2022-11-23T02:48:20.1898150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1898278Z return func(*args, **kwargs) 2022-11-23T02:48:20.1898526Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1898644Z self.run_subtests( 2022-11-23T02:48:20.1899004Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1899172Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1899540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1899699Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1900085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1900207Z output = model(*input) 2022-11-23T02:48:20.1900519Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1900791Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1901183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1901364Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1901743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1901868Z _lazy_init(state, module) 2022-11-23T02:48:20.1902224Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1902371Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1902717Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1902828Z return func(*args, **kwargs) 2022-11-23T02:48:20.1903215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1903325Z p_assert( 2022-11-23T02:48:20.1903666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.1903795Z traceback.print_stack() 2022-11-23T02:48:20.1903926Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.1904197Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.1904335Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.1904541Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.1904693Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.1904907Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.1905015Z self.run() 2022-11-23T02:48:20.1905219Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.1905368Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.1905730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.1905850Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.1906221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.1906352Z getattr(self, test_name)() 2022-11-23T02:48:20.1906722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.1906826Z fn() 2022-11-23T02:48:20.1907198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.1907325Z test(self, **param_kwargs) 2022-11-23T02:48:20.1907688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.1907799Z return func(*args, **kwargs) 2022-11-23T02:48:20.1908063Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:48:20.1908181Z self.run_subtests( 2022-11-23T02:48:20.1908541Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.1908712Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.1909083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.1909239Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.1909623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.1909729Z output = model(*input) 2022-11-23T02:48:20.1910059Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.1910271Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.1910660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.1910840Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.1911218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.1911344Z _lazy_init(state, module) 2022-11-23T02:48:20.1911702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.1911833Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.1912181Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.1912312Z return func(*args, **kwargs) 2022-11-23T02:48:20.1912695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.1912809Z p_assert( 2022-11-23T02:48:20.1913154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.1913283Z traceback.print_stack() 2022-11-23T02:48:20.1913525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1913800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1914044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1914282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1914517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1914748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1914983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1915443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1915679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1915901Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1916130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1916358Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1917384Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T02:48:20.1917629Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:48:20.1918640Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1918878Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:48:20.1919117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1919350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1919683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1919919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1920136Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1920368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1920600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1920829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1921059Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1921290Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1921517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.1921752Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1921979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1922194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1922487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1922727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1922955Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1923181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1923408Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1923641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1923867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1924080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1924312Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1924543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1925313Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1926053Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1926295Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1926529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1926757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1926988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1927224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1927438Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1927670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1927962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1928193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1928422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1928654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1928881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.1928996Z dist init r=1, world=2 2022-11-23T02:48:20.1929336Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1929646Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1929965Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1930297Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1930668Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1930992Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1931320Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1931641Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1931959Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1932295Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1932614Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.1932925Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.1933023Z dist init r=0, world=2 2022-11-23T02:48:20.1933333Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1933644Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1933952Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1934274Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1934595Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1934911Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1935294Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1935615Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1935924Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1936239Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.1936549Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1936845Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.1936950Z ok (5.213s) 2022-11-23T02:48:20.1937166Z test_delayed_reduce_scatter_offload_true_none (__main__.TestParityWithDDP) 2022-11-23T02:48:20.1938137Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82399 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:48:20.1938379Z test_delayed_reduce_scatter_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:48:20.1939274Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82403 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:48:20.1939637Z test_mixture_of_experts_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85793 2022-11-23T02:48:20.1939864Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85794 2022-11-23T02:48:20.1940250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1940430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1940820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1941001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.1941381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1941557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1941939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1942137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.1942388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.1942633Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.1943046Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1943450Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.1943734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.1943968Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.1945004Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1945123Z warnings.warn( 2022-11-23T02:48:20.1945369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.1946440Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1946567Z warnings.warn( 2022-11-23T02:48:20.1946818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.1947224Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.1947625Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.1947873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.1948105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.1948514Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.1948922Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.1949168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.1949412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.1949813Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.1950211Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.1950978Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1951719Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.1951965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.1952206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.1952603Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.1953062Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.1953309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.1953551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.1953955Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.1954356Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.1954601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.1954842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.1955489Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.1955899Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:48:20.1956205Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.1956459Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.1956860Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.1957254Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.1958287Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.1958506Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.1959536Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T02:48:20.1959743Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.1959995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.1960241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.1960643Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.1961048Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.1961280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.1961521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.1961928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.1962327Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.1962654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.1962895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.1963307Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.1963704Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:48:20.1963950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.1964171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.1964574Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.1964977Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.1965791Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1966047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.1966286Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.1966690Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.1967088Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.1967338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.1967579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.1967983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 
2022-11-23T02:48:20.1968720Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1969478Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1970221Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1970628Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.1971357Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1972152Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1972906Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1973156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.1973397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.1973846Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.1974250Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.1974497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.1974786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.1975204Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.1975598Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.1976332Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.1976588Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.1976830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.1977239Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.1977637Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.1977880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.1978117Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.1978516Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.1978919Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.1979165Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.1979393Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.1979792Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.1980193Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:48:20.1980439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.1980677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.1981143Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.1981539Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.1982293Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.1982541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.1982781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.1983164Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.1983568Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.1983809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.1984098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.1984512Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 
2022-11-23T02:48:20.1984902Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.1985145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.1985385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.1985793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.1986189Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.1986422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.1986659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.1987060Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.1987455Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.1988198Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.1988447Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.1988691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.1989090Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.1989489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.1989606Z dist init r=1, world=2 2022-11-23T02:48:20.1989702Z dist init r=0, world=2 2022-11-23T02:48:20.1989805Z ok (6.215s) 2022-11-23T02:48:20.1990155Z test_mixture_of_experts_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86116 2022-11-23T02:48:20.1990446Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86117 2022-11-23T02:48:20.1990831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1991015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1991409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1991606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.1991963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.1992141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.1992521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.1992718Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-11-23T02:48:20.1992968Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.1993215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.1993674Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1994092Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.1994329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.1994544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.1995905Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.1996039Z warnings.warn( 2022-11-23T02:48:20.1997057Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.1997171Z warnings.warn( 2022-11-23T02:48:20.1997420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.1997666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.1998069Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.1998471Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.1998716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.1998959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.1999346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.1999750Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.2000099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.2000338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.2000742Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2001140Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2001899Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2002151Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.2002957Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2003218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.2003621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2004000Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2004246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.2004493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.2004897Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2005301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2005548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.2005792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.2006186Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:48:20.2006582Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.2007335Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2007587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.2007819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.2008218Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2008621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2009371Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2009682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.2009924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.2010333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.2010731Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:48:20.2010981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.2011221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.2011613Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2012016Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2012813Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2013073Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.2013313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.2013720Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.2014127Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.2014881Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.2015131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.2015532Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.2015775Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.2016160Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.2016412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.2016649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.2017055Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.2017454Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.2018200Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2018446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.2018768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.2019170Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 
2022-11-23T02:48:20.2019572Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.2020328Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2020560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.2020801Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.2021207Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.2022006Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2022422Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.2022668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.2022907Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.2023303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.2023704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:48:20.2024450Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2024704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.2024930Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.2025328Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2025729Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2025974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.2026214Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.2026613Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.2027012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.2027764Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.2028073Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.2028314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.2028723Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.2029110Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.2029350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.2029586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.2029981Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.2030381Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.2030629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.2030869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.2031316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.2031722Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:48:20.2031951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.2032189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.2032586Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.2032990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.2033749Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2034496Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2034744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.2034988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.2035635Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.2036043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:48:20.2036288Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.2036511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.2036911Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.2037303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.2037643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.2037881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.2038286Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.2038685Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.2038802Z dist init r=1, world=2 2022-11-23T02:48:20.2038913Z dist init r=0, world=2 2022-11-23T02:48:20.2039000Z ok (6.415s) 2022-11-23T02:48:20.2039355Z test_mixture_of_experts_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86439 2022-11-23T02:48:20.2039579Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86440 2022-11-23T02:48:20.2039967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2040146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2040532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2040850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.2041240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2041403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2041789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2041982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.2042235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.2042484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.2042891Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.2043298Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.2043535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.2043768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.2044800Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.2044925Z warnings.warn( 2022-11-23T02:48:20.2045158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.2046184Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.2046300Z warnings.warn( 2022-11-23T02:48:20.2046547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.2047020Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.2047419Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.2047668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.2047912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.2048311Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.2048716Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.2048943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.2049191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.2049589Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2050037Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2050810Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2051551Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.2051799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.2052042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.2052448Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2052850Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2053092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.2053313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.2053712Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2054118Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2054365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.2054605Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.2055002Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.2055401Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.2056153Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2056466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.2056710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.2057097Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2057496Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2058246Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2058497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.2058740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.2059140Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.2059587Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.2059845Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.2060085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.2060489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:48:20.2060888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2061635Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2061886Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.2062124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.2062525Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.2062924Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.2063678Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2063929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.2064171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.2064619Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.2065014Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:48:20.2065261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.2065546Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.2065949Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.2066350Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.2067100Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2067349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.2067587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.2067991Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.2068391Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.2069183Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.2069440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.2069679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.2070067Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.2070474Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.2071236Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2071486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.2071726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.2072128Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.2072530Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.2073288Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.2073582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.2073823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.2074223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2074603Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2074913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.2075374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.2075792Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.2076190Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.2076941Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2077188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.2077430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.2077836Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:48:20.2078314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.2078557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.2078800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.2079203Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.2079601Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.2079850Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.2080087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.2080487Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.2080887Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.2081132Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.2081369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.2081750Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.2082152Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.2082914Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2083664Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2083911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.2084151Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.2084649Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.2085045Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.2085287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.2085523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.2085921Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.2086299Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 
2022-11-23T02:48:20.2086543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.2086787Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.2087188Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.2087584Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.2087752Z dist init r=1, world=2 2022-11-23T02:48:20.2087874Z dist init r=0, world=2 2022-11-23T02:48:20.2087977Z ok (6.215s) 2022-11-23T02:48:20.2088315Z test_mixture_of_experts_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86762 2022-11-23T02:48:20.2088543Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86763 2022-11-23T02:48:20.2088933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2089123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2089515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2089712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.2090083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2090265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2090649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2090826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:48:20.2091075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.2091321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.2091731Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.2092134Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.2092373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.2092606Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.2093634Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.2093814Z warnings.warn( 2022-11-23T02:48:20.2094841Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.2094955Z warnings.warn( 2022-11-23T02:48:20.2095185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.2095431Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.2095836Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.2095978Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2096195Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2096342Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2096551Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2096755Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2096967Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2097075Z self.run() 2022-11-23T02:48:20.2097280Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2097428Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2097785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2097922Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2098300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2098427Z getattr(self, test_name)() 2022-11-23T02:48:20.2098777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2098880Z fn() 2022-11-23T02:48:20.2099256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2099383Z test(self, **param_kwargs) 2022-11-23T02:48:20.2099750Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2099882Z return func(*args, **kwargs) 2022-11-23T02:48:20.2100134Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2100251Z self.run_subtests( 2022-11-23T02:48:20.2100600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2100766Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2101137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2101298Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2101682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2101803Z output = model(*input) 2022-11-23T02:48:20.2102137Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2102281Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2102646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2102894Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2103276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2103402Z _lazy_init(state, module) 2022-11-23T02:48:20.2103764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2103914Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2104259Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.2104392Z return func(*args, **kwargs) 2022-11-23T02:48:20.2104768Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2104876Z p_assert( 2022-11-23T02:48:20.2105219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2105355Z traceback.print_stack() 2022-11-23T02:48:20.2105762Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.2105895Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2106110Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2106310Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2106511Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2106666Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2106883Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2106990Z self.run() 2022-11-23T02:48:20.2107195Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2107345Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2107704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2107823Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2108194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2108324Z getattr(self, test_name)() 2022-11-23T02:48:20.2108693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2108796Z fn() 2022-11-23T02:48:20.2109170Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2109297Z test(self, **param_kwargs) 2022-11-23T02:48:20.2109663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2109774Z return func(*args, **kwargs) 2022-11-23T02:48:20.2110034Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2110153Z self.run_subtests( 2022-11-23T02:48:20.2110517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2110683Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2111052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2111211Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2111592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2111699Z output = model(*input) 2022-11-23T02:48:20.2112036Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2112183Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2112633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2112813Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2113183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2113312Z _lazy_init(state, module) 2022-11-23T02:48:20.2113670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2113819Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2114147Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2114276Z return func(*args, **kwargs) 2022-11-23T02:48:20.2114662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2114774Z p_assert( 2022-11-23T02:48:20.2115341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2115482Z traceback.print_stack() 2022-11-23T02:48:20.2115732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.2116047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.2116481Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:48:20.2116616Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2116829Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2116974Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2117181Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2117341Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2117559Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2117651Z self.run() 2022-11-23T02:48:20.2117855Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2118005Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2118359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2118498Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2118870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2118999Z getattr(self, test_name)() 2022-11-23T02:48:20.2119368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2119452Z fn() 2022-11-23T02:48:20.2119833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2119959Z test(self, **param_kwargs) 2022-11-23T02:48:20.2120326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2120453Z return func(*args, **kwargs) 2022-11-23T02:48:20.2120710Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2120827Z self.run_subtests( 2022-11-23T02:48:20.2121190Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2121342Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2121713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2121868Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2122339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2122462Z output = model(*input) 2022-11-23T02:48:20.2122794Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2122943Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2123329Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2123493Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2123869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2123994Z _lazy_init(state, module) 2022-11-23T02:48:20.2124352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2124501Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2124845Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2124976Z return func(*args, **kwargs) 2022-11-23T02:48:20.2125434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2125536Z p_assert( 2022-11-23T02:48:20.2125885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2126015Z traceback.print_stack() 2022-11-23T02:48:20.2126419Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.2126555Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2126774Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2126928Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2127137Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2127274Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2127492Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2127599Z self.run() 2022-11-23T02:48:20.2127810Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2127962Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2128311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2128450Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2128801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2128930Z getattr(self, test_name)() 2022-11-23T02:48:20.2129301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2129403Z fn() 2022-11-23T02:48:20.2129777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2129903Z test(self, **param_kwargs) 2022-11-23T02:48:20.2130269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2130399Z return func(*args, **kwargs) 2022-11-23T02:48:20.2130634Z File
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2130751Z self.run_subtests( 2022-11-23T02:48:20.2131112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2131283Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2131727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2131883Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2132268Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2132390Z output = model(*input) 2022-11-23T02:48:20.2132708Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2132853Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2133238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2133418Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2133793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2133922Z _lazy_init(state, module) 2022-11-23T02:48:20.2134281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2134427Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2134772Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2134936Z return func(*args, **kwargs) 2022-11-23T02:48:20.2135337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2135444Z p_assert( 2022-11-23T02:48:20.2135782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2135911Z traceback.print_stack() 2022-11-23T02:48:20.2136163Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.2136413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.2136833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2137222Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2137361Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2137577Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2137723Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2137932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2138086Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2138305Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2138395Z self.run() 2022-11-23T02:48:20.2138601Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2138754Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2139103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2139240Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2139617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2139750Z getattr(self, test_name)() 2022-11-23T02:48:20.2140117Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2140201Z fn() 2022-11-23T02:48:20.2140573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2140698Z test(self, **param_kwargs) 2022-11-23T02:48:20.2141061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2141252Z return func(*args, **kwargs) 2022-11-23T02:48:20.2141504Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2141619Z self.run_subtests( 2022-11-23T02:48:20.2141986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2142138Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2142508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2142663Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2143044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2143171Z output = model(*input) 2022-11-23T02:48:20.2143504Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2143654Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2144038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2144199Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2144625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2144761Z _lazy_init(state, module) 2022-11-23T02:48:20.2145123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2145270Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2145612Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2145739Z return func(*args, **kwargs) 2022-11-23T02:48:20.2146135Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2146228Z p_assert( 2022-11-23T02:48:20.2146573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2146703Z traceback.print_stack() 2022-11-23T02:48:20.2146841Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2147058Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2147204Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2147410Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2147562Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2147762Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2147868Z self.run() 2022-11-23T02:48:20.2148080Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2148229Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2148579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2148715Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2149088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2149200Z getattr(self, test_name)() 2022-11-23T02:48:20.2149569Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2149670Z fn() 2022-11-23T02:48:20.2150041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2150167Z test(self, **param_kwargs) 2022-11-23T02:48:20.2150527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2150736Z return func(*args, **kwargs) 2022-11-23T02:48:20.2150991Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2151091Z self.run_subtests( 2022-11-23T02:48:20.2151459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2151628Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2151999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2152154Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2152536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2152660Z output = model(*input) 2022-11-23T02:48:20.2152999Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2153126Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2153510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2153691Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2154119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2154254Z _lazy_init(state, module) 2022-11-23T02:48:20.2154615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2154760Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2155319Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2155459Z return func(*args, **kwargs) 2022-11-23T02:48:20.2155844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2155950Z p_assert( 2022-11-23T02:48:20.2156291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2156420Z traceback.print_stack() 2022-11-23T02:48:20.2156678Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.2156927Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.2157338Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-11-23T02:48:20.2157473Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2157672Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2157823Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2158030Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2158183Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2158399Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2158507Z self.run() 2022-11-23T02:48:20.2158720Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2158854Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2159202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2159338Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2159709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2159834Z getattr(self, test_name)() 2022-11-23T02:48:20.2160198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2160392Z fn() 2022-11-23T02:48:20.2160769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2160877Z test(self, **param_kwargs) 2022-11-23T02:48:20.2161240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2161367Z return func(*args, **kwargs) 2022-11-23T02:48:20.2161620Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2161737Z self.run_subtests( 2022-11-23T02:48:20.2162096Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2162263Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2162636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2162776Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2163159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2163284Z output = model(*input) 2022-11-23T02:48:20.2163679Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2163836Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2164223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2164401Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2164779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2164891Z _lazy_init(state, module) 2022-11-23T02:48:20.2165251Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2165396Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2165746Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2165879Z return func(*args, **kwargs) 2022-11-23T02:48:20.2166268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2166374Z p_assert( 2022-11-23T02:48:20.2166717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2166830Z traceback.print_stack() 2022-11-23T02:48:20.2167239Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2167376Z File "", line 1, in 2022-11-23T02:48:20.2167593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2167739Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2167946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2168100Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2168318Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2168408Z self.run() 2022-11-23T02:48:20.2168614Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2168762Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2169114Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2169250Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2169625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2169818Z getattr(self, test_name)() 2022-11-23T02:48:20.2170193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2170276Z fn() 2022-11-23T02:48:20.2170650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2170778Z test(self, **param_kwargs) 2022-11-23T02:48:20.2171140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2171269Z return func(*args, **kwargs) 2022-11-23T02:48:20.2171520Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2171636Z self.run_subtests( 2022-11-23T02:48:20.2171981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2172152Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2172527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2172743Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2173185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2173318Z output = model(*input) 2022-11-23T02:48:20.2173695Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2173839Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2174205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2174386Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2174770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2174897Z _lazy_init(state, module) 2022-11-23T02:48:20.2175255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2175401Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2175752Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2175886Z return func(*args, **kwargs) 2022-11-23T02:48:20.2176278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2176367Z p_assert( 2022-11-23T02:48:20.2176711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2176840Z traceback.print_stack() 2022-11-23T02:48:20.2177097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.2177348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.2177755Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2177892Z File "", line 1, in 2022-11-23T02:48:20.2178106Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2178234Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2178439Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2178593Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2178811Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2178917Z self.run() 2022-11-23T02:48:20.2179185Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2179337Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2179671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2179807Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2180175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2180302Z getattr(self, test_name)() 2022-11-23T02:48:20.2180669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2180771Z fn() 2022-11-23T02:48:20.2181147Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2181274Z test(self, **param_kwargs) 2022-11-23T02:48:20.2181623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2181755Z return func(*args, **kwargs) 2022-11-23T02:48:20.2182005Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2182120Z self.run_subtests( 2022-11-23T02:48:20.2182530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2182710Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2183085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2183241Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2183605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2183728Z output = model(*input) 2022-11-23T02:48:20.2184066Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2184211Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2184593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2184774Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2185149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2185275Z _lazy_init(state, module) 2022-11-23T02:48:20.2185613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2185760Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2186104Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2186231Z return func(*args, **kwargs) 2022-11-23T02:48:20.2186620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2186726Z p_assert( 2022-11-23T02:48:20.2187065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2187195Z traceback.print_stack() 2022-11-23T02:48:20.2187587Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2187724Z File "", line 1, in 2022-11-23T02:48:20.2187937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2188083Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2188288Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2188444Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2188789Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2188895Z self.run() 2022-11-23T02:48:20.2189083Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2189229Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2189585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2189721Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2190088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2190214Z getattr(self, test_name)() 2022-11-23T02:48:20.2190579Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2190665Z fn() 2022-11-23T02:48:20.2191039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2191168Z test(self, **param_kwargs) 2022-11-23T02:48:20.2191529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2191657Z return func(*args, **kwargs) 2022-11-23T02:48:20.2191961Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2192086Z self.run_subtests( 2022-11-23T02:48:20.2192454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2192601Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2192969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2193126Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2193509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2193641Z output = model(*input) 2022-11-23T02:48:20.2193971Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2194115Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2194504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2194684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2195282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2195429Z _lazy_init(state, module) 2022-11-23T02:48:20.2195794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2195940Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2196289Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2196418Z return func(*args, **kwargs) 2022-11-23T02:48:20.2196810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2196917Z p_assert( 2022-11-23T02:48:20.2197247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2197379Z traceback.print_stack() 2022-11-23T02:48:20.2197630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.2197911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.2198320Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:48:20.2198548Z File "", line 1, in 2022-11-23T02:48:20.2198763Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2198911Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2199102Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2199254Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2199479Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2199586Z self.run() 2022-11-23T02:48:20.2199795Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2199946Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2200299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2200422Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2200790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2200919Z getattr(self, test_name)() 2022-11-23T02:48:20.2201286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2201387Z fn() 2022-11-23T02:48:20.2201819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2201956Z test(self, **param_kwargs) 2022-11-23T02:48:20.2202323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2202435Z return func(*args, **kwargs) 2022-11-23T02:48:20.2202687Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2202801Z self.run_subtests( 2022-11-23T02:48:20.2203169Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2203343Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2203717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2203873Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2204260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2204367Z output = model(*input) 2022-11-23T02:48:20.2204698Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2204841Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2205222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2205403Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2205783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2205911Z _lazy_init(state, module) 2022-11-23T02:48:20.2206266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2206395Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2206744Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2206873Z return func(*args, **kwargs) 2022-11-23T02:48:20.2207262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2207366Z p_assert( 2022-11-23T02:48:20.2207707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2207836Z traceback.print_stack() 2022-11-23T02:48:20.2208313Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.2208429Z File "", line 1, in 2022-11-23T02:48:20.2208644Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2208791Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2209001Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2209155Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2209372Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2209479Z self.run() 2022-11-23T02:48:20.2209684Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2209817Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2210168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2210310Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2210681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2210805Z getattr(self, test_name)() 2022-11-23T02:48:20.2211217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2211329Z fn() 2022-11-23T02:48:20.2211687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2211815Z test(self, **param_kwargs) 2022-11-23T02:48:20.2212173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2212301Z return func(*args, **kwargs) 2022-11-23T02:48:20.2212551Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2212672Z self.run_subtests( 2022-11-23T02:48:20.2213037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2213207Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2213568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2213725Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2214108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2214230Z output = model(*input) 2022-11-23T02:48:20.2214563Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2214707Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2215091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2215279Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2215657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2215766Z _lazy_init(state, module) 2022-11-23T02:48:20.2216126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2216275Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2216621Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2216749Z return func(*args, **kwargs) 2022-11-23T02:48:20.2217131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2217238Z p_assert( 2022-11-23T02:48:20.2217664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2217778Z traceback.print_stack() 2022-11-23T02:48:20.2218028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.2218279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.2218691Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2218825Z File "", line 1, in 2022-11-23T02:48:20.2219042Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2219188Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2219396Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2219533Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2219753Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2219864Z self.run() 2022-11-23T02:48:20.2220069Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2220219Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2220621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2220770Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2221128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2221255Z getattr(self, test_name)() 2022-11-23T02:48:20.2221623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2221724Z fn() 2022-11-23T02:48:20.2222095Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2222230Z test(self, **param_kwargs) 2022-11-23T02:48:20.2222595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2222723Z return func(*args, **kwargs) 2022-11-23T02:48:20.2222963Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2223078Z self.run_subtests( 2022-11-23T02:48:20.2223443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2223610Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2223981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2224139Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2224527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2224655Z output = model(*input) 2022-11-23T02:48:20.2224973Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2225116Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2225501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2225681Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2226054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2226179Z _lazy_init(state, module) 2022-11-23T02:48:20.2226538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2226687Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2227083Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2227213Z return func(*args, **kwargs) 2022-11-23T02:48:20.2227596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2227701Z p_assert( 2022-11-23T02:48:20.2228049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2228180Z traceback.print_stack() 2022-11-23T02:48:20.2228584Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2228718Z File "", line 1, in 2022-11-23T02:48:20.2228915Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2229059Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2229270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2229422Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2229636Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2229743Z self.run() 2022-11-23T02:48:20.2229946Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2230151Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2230496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2230634Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2231001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2231126Z getattr(self, test_name)() 2022-11-23T02:48:20.2231490Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2231597Z fn() 2022-11-23T02:48:20.2231970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2232081Z test(self, **param_kwargs) 2022-11-23T02:48:20.2232445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2232579Z return func(*args, **kwargs) 2022-11-23T02:48:20.2232831Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2232948Z self.run_subtests( 2022-11-23T02:48:20.2233307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2233473Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2233844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2233988Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2234372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2234496Z output = model(*input) 2022-11-23T02:48:20.2234834Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2234979Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2235601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2235784Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2236159Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2236285Z _lazy_init(state, module) 2022-11-23T02:48:20.2236627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2236871Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2237223Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2237351Z return func(*args, **kwargs) 2022-11-23T02:48:20.2237746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2237854Z p_assert( 2022-11-23T02:48:20.2238199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2238330Z traceback.print_stack() 2022-11-23T02:48:20.2238561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.2238813Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.2239226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:48:20.2239361Z File "", line 1, in 2022-11-23T02:48:20.2239576Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2239721Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2239992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2240163Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2240363Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2240468Z self.run() 2022-11-23T02:48:20.2240672Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2240821Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2241171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2241312Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2241680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2241793Z getattr(self, test_name)() 2022-11-23T02:48:20.2242159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2242263Z fn() 2022-11-23T02:48:20.2242640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2242769Z test(self, **param_kwargs) 2022-11-23T02:48:20.2243136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2243266Z return func(*args, **kwargs) 2022-11-23T02:48:20.2243517Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2243623Z self.run_subtests( 2022-11-23T02:48:20.2243988Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2244159Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2244532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2244691Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2245071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2245194Z output = model(*input) 2022-11-23T02:48:20.2245527Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2245656Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2246042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2246289Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2246665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2246789Z _lazy_init(state, module) 2022-11-23T02:48:20.2247150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2247297Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2247641Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2247754Z return func(*args, **kwargs) 2022-11-23T02:48:20.2248142Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2248245Z p_assert( 2022-11-23T02:48:20.2248590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2248725Z traceback.print_stack() 2022-11-23T02:48:20.2249133Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.2249266Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2249529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2249669Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2249874Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2250028Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2250245Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2250352Z self.run() 2022-11-23T02:48:20.2250561Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2250716Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2251068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2251188Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2251557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2251688Z getattr(self, test_name)() 2022-11-23T02:48:20.2252055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2252156Z fn() 2022-11-23T02:48:20.2252526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2252653Z test(self, **param_kwargs) 2022-11-23T02:48:20.2252996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2253131Z return func(*args, **kwargs) 2022-11-23T02:48:20.2253381Z File
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2253497Z self.run_subtests( 2022-11-23T02:48:20.2253859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2254029Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2254401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2254558Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2254942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2255048Z output = model(*input) 2022-11-23T02:48:20.2255378Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2255586Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2255974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2256152Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2256529Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2256654Z _lazy_init(state, module) 2022-11-23T02:48:20.2257010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2257139Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2257482Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2257610Z return func(*args, **kwargs) 2022-11-23T02:48:20.2257994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2258105Z p_assert( 2022-11-23T02:48:20.2258448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2258577Z traceback.print_stack() 2022-11-23T02:48:20.2258863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.2259121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.2259534Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2259667Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2259882Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2260026Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2260242Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2260397Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2260597Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2260705Z self.run() 2022-11-23T02:48:20.2260911Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2261063Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2261415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2261551Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2261917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2262043Z getattr(self, test_name)() 2022-11-23T02:48:20.2262392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2262497Z fn() 2022-11-23T02:48:20.2262866Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2262993Z test(self, **param_kwargs) 2022-11-23T02:48:20.2263352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2263485Z return func(*args, **kwargs) 2022-11-23T02:48:20.2263739Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2263856Z self.run_subtests( 2022-11-23T02:48:20.2264197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2264364Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2264735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2264954Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2265343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2265465Z output = model(*input) 2022-11-23T02:48:20.2265798Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2265947Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2266314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2266492Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2266865Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2266991Z _lazy_init(state, module) 2022-11-23T02:48:20.2267349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2267499Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2267848Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2267976Z return func(*args, **kwargs) 2022-11-23T02:48:20.2268394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2268512Z p_assert( 2022-11-23T02:48:20.2268856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2268984Z traceback.print_stack() 2022-11-23T02:48:20.2269393Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2269528Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2269742Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2269895Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2270085Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2270240Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2270459Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2270567Z self.run() 2022-11-23T02:48:20.2270773Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2270923Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2271274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2271411Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2271764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2271896Z getattr(self, test_name)() 2022-11-23T02:48:20.2272263Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2272365Z fn() 2022-11-23T02:48:20.2272736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2272866Z test(self, **param_kwargs) 2022-11-23T02:48:20.2273233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2273361Z return func(*args, **kwargs) 2022-11-23T02:48:20.2273643Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2273762Z self.run_subtests( 2022-11-23T02:48:20.2274125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2274292Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2274730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2274886Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2275486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2275617Z output = model(*input) 2022-11-23T02:48:20.2275941Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2276085Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2276467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2276647Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2277020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2277150Z _lazy_init(state, module) 2022-11-23T02:48:20.2277507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2277653Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2278063Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2278209Z return func(*args, **kwargs) 2022-11-23T02:48:20.2278598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2278705Z p_assert( 2022-11-23T02:48:20.2279048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2279179Z traceback.print_stack() 2022-11-23T02:48:20.2279429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.2279677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.2280071Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.2280480Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:48:20.2280614Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2280832Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2280977Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2281184Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2281339Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2281557Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2281650Z self.run() 2022-11-23T02:48:20.2281855Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2282004Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2282354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2282490Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2282861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2282988Z getattr(self, test_name)() 2022-11-23T02:48:20.2283339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2283440Z fn() 2022-11-23T02:48:20.2283813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2283940Z test(self, **param_kwargs) 2022-11-23T02:48:20.2284400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2284530Z return func(*args, **kwargs) 2022-11-23T02:48:20.2284780Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2284896Z self.run_subtests( 2022-11-23T02:48:20.2285243Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2285411Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2285785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2285941Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2286324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2286451Z output = model(*input) 2022-11-23T02:48:20.2286786Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2286930Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2287299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2287534Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2287925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2288054Z _lazy_init(state, module) 2022-11-23T02:48:20.2288410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2288557Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2288899Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2289033Z return func(*args, **kwargs) 2022-11-23T02:48:20.2289422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2289511Z p_assert( 2022-11-23T02:48:20.2289854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2289988Z traceback.print_stack() 2022-11-23T02:48:20.2290121Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2290335Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2290481Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2290686Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2290824Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2291041Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2291154Z self.run() 2022-11-23T02:48:20.2291359Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2291510Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2291861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2292001Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2292375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2292485Z getattr(self, test_name)() 2022-11-23T02:48:20.2292855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2292960Z fn() 2022-11-23T02:48:20.2293337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2293463Z test(self, **param_kwargs) 2022-11-23T02:48:20.2293894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2294022Z return func(*args, **kwargs) 2022-11-23T02:48:20.2294275Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2294377Z self.run_subtests( 2022-11-23T02:48:20.2294740Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2294908Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2295277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2295433Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2295815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2295944Z output = model(*input) 2022-11-23T02:48:20.2296278Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2296406Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2296791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2297020Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2297406Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2297529Z _lazy_init(state, module) 2022-11-23T02:48:20.2297885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2298031Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2298373Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2298494Z return func(*args, **kwargs) 2022-11-23T02:48:20.2298884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2298991Z p_assert( 2022-11-23T02:48:20.2299340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2299471Z traceback.print_stack() 2022-11-23T02:48:20.2299721Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.2299962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.2300375Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.2300492Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2300706Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2300857Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2301063Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2301216Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2301435Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2301543Z self.run() 2022-11-23T02:48:20.2301730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2301879Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2302229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2302366Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2302732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2302921Z getattr(self, test_name)() 2022-11-23T02:48:20.2303291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2303390Z fn() 2022-11-23T02:48:20.2303742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2303874Z test(self, **param_kwargs)
2022-11-23T02:48:20.2304244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2304375Z return func(*args, **kwargs) 2022-11-23T02:48:20.2304627Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2304745Z self.run_subtests( 2022-11-23T02:48:20.2305104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2305276Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2305631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2305786Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2306219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2306356Z output = model(*input) 2022-11-23T02:48:20.2306692Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2306836Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2307222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2307403Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2307760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2307892Z _lazy_init(state, module) 2022-11-23T02:48:20.2308257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2308407Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2308756Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 
27, in decorate_context 2022-11-23T02:48:20.2308884Z return func(*args, **kwargs) 2022-11-23T02:48:20.2309269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2309375Z p_assert( 2022-11-23T02:48:20.2309701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2309831Z traceback.print_stack() 2022-11-23T02:48:20.2310238Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.2310377Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2310593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2310736Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2310943Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2311100Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2311302Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2311407Z self.run() 2022-11-23T02:48:20.2311612Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2311760Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2312107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2312243Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2312685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2312811Z getattr(self, test_name)() 2022-11-23T02:48:20.2313163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2313265Z fn() 2022-11-23T02:48:20.2313638Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2313770Z test(self, **param_kwargs) 2022-11-23T02:48:20.2314128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2314255Z return func(*args, **kwargs) 2022-11-23T02:48:20.2314507Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2314625Z self.run_subtests( 2022-11-23T02:48:20.2314974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2315366Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2315750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2315981Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2316382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2316505Z output = model(*input) 2022-11-23T02:48:20.2316840Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2316982Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2317346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2317534Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2317909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2318035Z _lazy_init(state, module) 2022-11-23T02:48:20.2318397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2318543Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2318887Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2319013Z return func(*args, **kwargs) 2022-11-23T02:48:20.2319380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2319486Z p_assert( 2022-11-23T02:48:20.2319826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2319958Z traceback.print_stack() 2022-11-23T02:48:20.2320209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.2320452Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.2320867Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-11-23T02:48:20.2321006Z File "", line 1, in 2022-11-23T02:48:20.2321205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2321351Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2321557Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2321710Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2321927Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2322115Z self.run() 2022-11-23T02:48:20.2322324Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2322454Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2322807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2322948Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2323320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2323451Z getattr(self, test_name)() 2022-11-23T02:48:20.2323818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2323919Z fn() 2022-11-23T02:48:20.2324290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2324401Z test(self, **param_kwargs) 2022-11-23T02:48:20.2324772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2324900Z return func(*args, **kwargs) 2022-11-23T02:48:20.2325154Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2325271Z self.run_subtests( 2022-11-23T02:48:20.2325683Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2325863Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2326239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2326379Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2326761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2326893Z output = model(*input) 2022-11-23T02:48:20.2327232Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2327376Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2327761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2327945Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2328327Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2328434Z _lazy_init(state, module) 2022-11-23T02:48:20.2328796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2328942Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2329289Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2329425Z return func(*args, **kwargs) 2022-11-23T02:48:20.2329815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2329921Z p_assert( 2022-11-23T02:48:20.2330269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2330382Z traceback.print_stack() 2022-11-23T02:48:20.2330794Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.2330928Z File "", line 1, in 2022-11-23T02:48:20.2331145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2331294Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2331501Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2331723Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2331941Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2332031Z self.run() 2022-11-23T02:48:20.2332238Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2332387Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2332746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2332884Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2333257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2333387Z getattr(self, test_name)() 2022-11-23T02:48:20.2333751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2333835Z fn() 2022-11-23T02:48:20.2334213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2334340Z test(self, **param_kwargs) 2022-11-23T02:48:20.2334703Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2334835Z return func(*args, **kwargs) 2022-11-23T02:48:20.2335196Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2335326Z self.run_subtests( 2022-11-23T02:48:20.2335690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2335839Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2336207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2336364Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2336752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2336877Z output = model(*input) 2022-11-23T02:48:20.2337209Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2337357Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2337743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2337906Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2338281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2338405Z _lazy_init(state, module) 2022-11-23T02:48:20.2338761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2338912Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2339258Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2339388Z return func(*args, **kwargs) 2022-11-23T02:48:20.2339780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2339872Z p_assert( 2022-11-23T02:48:20.2340219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2340349Z traceback.print_stack() 2022-11-23T02:48:20.2340600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.2340844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.2341256Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.2341752Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.2342002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.2342392Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.2342634Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.2343029Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.2343275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.2343675Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.2343923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.2344323Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:48:20.2344621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.2344871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.2345275Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2345653Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2346417Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2347182Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2347431Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.2347833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.2348079Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.2348481Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 
2022-11-23T02:48:20.2348733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.2348974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.2349376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.2349774Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.2350001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.2350236Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.2350635Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.2351110Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.2351355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.2351758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.2351999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.2352396Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.2352641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.2353020Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 
2022-11-23T02:48:20.2353268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.2353666Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.2353962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.2354209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.2354610Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.2355003Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.2355467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.2355712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.2356115Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.2356498Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.2356743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.2357142Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.2357383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.2357778Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 
2022-11-23T02:48:20.2358027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:48:20.2358267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:48:20.2358664Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.2359064Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.2359294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:48:20.2359532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:48:20.2359932Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.2360327Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.2360661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:48:20.2360899Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:48:20.2361304Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.2361703Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.2361947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:48:20.2362324Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 
2022-11-23T02:48:20.2362565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:48:20.2362969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.2363211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:48:20.2363509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:48:20.2363924Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.2364319Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.2364563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:48:20.2364952Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.2365201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:48:20.2365580Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.2365826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:48:20.2366063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:48:20.2366462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.2366852Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 
2022-11-23T02:48:20.2367092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:48:20.2367338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:48:20.2367736Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.2368130Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.2368360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:48:20.2368602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:48:20.2369001Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.2369393Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.2369698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:48:20.2369937Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:48:20.2370338Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.2370738Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 
2022-11-23T02:48:20.2370981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:48:20.2371202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:48:20.2371599Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.2371993Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.2372239Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:48:20.2372476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:48:20.2372927Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.2373337Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.2373455Z dist init r=1, world=2 2022-11-23T02:48:20.2373834Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2374159Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2374463Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2374797Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.2375119Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2375433Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2375767Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2376090Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2376400Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2376731Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2377055Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2377367Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2377482Z dist init r=0, world=2 2022-11-23T02:48:20.2377845Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2378156Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.2378470Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2378790Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2379107Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2379414Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2379739Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2380095Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2380418Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2380740Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2381051Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2381365Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.2381455Z ok (6.515s) 2022-11-23T02:48:20.2381811Z test_mixture_of_experts_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87109 2022-11-23T02:48:20.2382036Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87110 2022-11-23T02:48:20.2382430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2382612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2383000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2383197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.2383573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2383752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2384131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2384326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.2384579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.2384830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.2385240Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.2385644Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.2385942Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.2386177Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.2387220Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.2387343Z warnings.warn( 2022-11-23T02:48:20.2387573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.2388652Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.2388781Z warnings.warn( 2022-11-23T02:48:20.2389033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.2389441Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.2389845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.2389981Z File "", line 1, in 2022-11-23T02:48:20.2390204Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2390350Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2390542Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2390695Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2390917Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2391027Z self.run() 2022-11-23T02:48:20.2391234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2391382Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2391736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2391873Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2392228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2392360Z getattr(self, test_name)() 2022-11-23T02:48:20.2392725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2392828Z fn() 2022-11-23T02:48:20.2393202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2393332Z test(self, **param_kwargs) 2022-11-23T02:48:20.2393696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2393825Z return func(*args, **kwargs) 2022-11-23T02:48:20.2394060Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2394177Z self.run_subtests( 2022-11-23T02:48:20.2394540Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2394774Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2395376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2395540Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2395934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2396059Z output = model(*input) 2022-11-23T02:48:20.2396376Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2396523Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2396908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2397086Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2397464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2397593Z _lazy_init(state, module) 2022-11-23T02:48:20.2397952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2398099Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2398508Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2398652Z return func(*args, **kwargs) 2022-11-23T02:48:20.2399045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2399154Z p_assert( 2022-11-23T02:48:20.2399500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2399630Z traceback.print_stack() 2022-11-23T02:48:20.2399762Z File "", line 1, in 2022-11-23T02:48:20.2399982Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2400110Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2400316Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2400470Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2400688Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2400797Z self.run() 2022-11-23T02:48:20.2401000Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2401149Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2401484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2401621Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2401987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2402119Z getattr(self, test_name)() 2022-11-23T02:48:20.2402489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2402590Z fn() 2022-11-23T02:48:20.2402966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2403096Z test(self, **param_kwargs) 2022-11-23T02:48:20.2403447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2403576Z return func(*args, **kwargs) 2022-11-23T02:48:20.2403828Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2403945Z self.run_subtests( 2022-11-23T02:48:20.2404306Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2404556Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2404935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2405095Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2405464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2405588Z output = model(*input) 2022-11-23T02:48:20.2405920Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2406065Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2406453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2406633Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2407014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2407138Z _lazy_init(state, module) 2022-11-23T02:48:20.2407482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2407628Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2408024Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2408163Z return func(*args, **kwargs) 2022-11-23T02:48:20.2408550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2408658Z p_assert( 2022-11-23T02:48:20.2409004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2409134Z traceback.print_stack() 2022-11-23T02:48:20.2409366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.2409622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.2410034Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.2410169Z File "", line 1, in 2022-11-23T02:48:20.2410388Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2410534Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2410740Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2410893Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2411094Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2411201Z self.run() 2022-11-23T02:48:20.2411404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2411558Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2411907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2412044Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2412417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2412545Z getattr(self, test_name)() 2022-11-23T02:48:20.2412895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2412999Z fn() 2022-11-23T02:48:20.2413377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2413509Z test(self, **param_kwargs) 
2022-11-23T02:48:20.2413878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2414082Z return func(*args, **kwargs) 2022-11-23T02:48:20.2414337Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2414454Z self.run_subtests( 2022-11-23T02:48:20.2414807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2414978Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2415351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2415506Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2415889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2416013Z output = model(*input) 2022-11-23T02:48:20.2416344Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2416492Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2416863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2417043Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2417466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2417602Z _lazy_init(state, module) 2022-11-23T02:48:20.2417966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2418111Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2418455Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 
27, in decorate_context 2022-11-23T02:48:20.2418583Z return func(*args, **kwargs) 2022-11-23T02:48:20.2418957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2419061Z p_assert( 2022-11-23T02:48:20.2419405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2419535Z traceback.print_stack() 2022-11-23T02:48:20.2419948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.2420083Z File "", line 1, in 2022-11-23T02:48:20.2420300Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2420448Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2420638Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2420790Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2421007Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2421120Z self.run() 2022-11-23T02:48:20.2421326Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2421475Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2421826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2421952Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2422324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2422453Z getattr(self, test_name)() 2022-11-23T02:48:20.2422818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2422920Z fn() 2022-11-23T02:48:20.2423290Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2423489Z test(self, **param_kwargs) 2022-11-23T02:48:20.2423856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2423968Z return func(*args, **kwargs) 2022-11-23T02:48:20.2424217Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2424337Z self.run_subtests( 2022-11-23T02:48:20.2424701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2424868Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2425238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2425397Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2425781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2425892Z output = model(*input) 2022-11-23T02:48:20.2426225Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2426367Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2426805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2426999Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2427374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2427497Z _lazy_init(state, module) 2022-11-23T02:48:20.2427853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2427983Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2428326Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2428458Z return func(*args, **kwargs) 2022-11-23T02:48:20.2428847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2428953Z p_assert( 2022-11-23T02:48:20.2429301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2429434Z traceback.print_stack() 2022-11-23T02:48:20.2429683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.2429917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.2430324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:48:20.2430457Z File "", line 1, in 2022-11-23T02:48:20.2430675Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2430822Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2431028Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2431182Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2431404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2431495Z self.run() 2022-11-23T02:48:20.2431702Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2431849Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2432198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2432334Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2432702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2432898Z getattr(self, test_name)() 2022-11-23T02:48:20.2433270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2433353Z fn() 2022-11-23T02:48:20.2433725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2433856Z test(self, **param_kwargs) 2022-11-23T02:48:20.2434217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2434346Z return func(*args, **kwargs) 2022-11-23T02:48:20.2434597Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2434712Z self.run_subtests( 2022-11-23T02:48:20.2435283Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2435452Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2435830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2435987Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2436447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2436582Z output = model(*input) 2022-11-23T02:48:20.2436917Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2437061Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2437447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2437613Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2437989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2438120Z _lazy_init(state, module) 2022-11-23T02:48:20.2438480Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2438631Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2438978Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2439106Z return func(*args, **kwargs) 2022-11-23T02:48:20.2439494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2439584Z p_assert( 2022-11-23T02:48:20.2439930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2440060Z traceback.print_stack() 2022-11-23T02:48:20.2440468Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2440607Z File "", line 1, in 2022-11-23T02:48:20.2440822Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2440967Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2441174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2441311Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2441526Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2441633Z self.run() 2022-11-23T02:48:20.2441837Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2441987Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2442337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2442551Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2442909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2443035Z getattr(self, test_name)() 2022-11-23T02:48:20.2443401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2443506Z fn() 2022-11-23T02:48:20.2443884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2444011Z test(self, **param_kwargs) 2022-11-23T02:48:20.2444376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2444506Z return func(*args, **kwargs) 2022-11-23T02:48:20.2444742Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2444863Z self.run_subtests( 2022-11-23T02:48:20.2445223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2445390Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2445758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2445966Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2446362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2446485Z output = model(*input) 2022-11-23T02:48:20.2446799Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2446943Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2447321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2447509Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2447885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2448012Z _lazy_init(state, module) 2022-11-23T02:48:20.2448372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2448521Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2448849Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2448977Z return func(*args, **kwargs) 2022-11-23T02:48:20.2449363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2449472Z p_assert( 2022-11-23T02:48:20.2449818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2449957Z traceback.print_stack() 2022-11-23T02:48:20.2450208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.2450461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.2450859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2450994Z File "", line 1, in 2022-11-23T02:48:20.2451208Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2451353Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2451558Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2451710Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2451926Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2452096Z self.run() 2022-11-23T02:48:20.2452285Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2452434Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2452783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2452925Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2453296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2453423Z getattr(self, test_name)() 2022-11-23T02:48:20.2453789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2453893Z fn() 2022-11-23T02:48:20.2454249Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2454380Z test(self, **param_kwargs) 2022-11-23T02:48:20.2454739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2454867Z return func(*args, **kwargs) 2022-11-23T02:48:20.2455122Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2455289Z self.run_subtests( 2022-11-23T02:48:20.2455662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2455831Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2456184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2456343Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2456724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2456853Z output = model(*input) 2022-11-23T02:48:20.2457189Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2457336Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2457722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2457903Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2458261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2458386Z _lazy_init(state, module) 2022-11-23T02:48:20.2458746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2458892Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2459241Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2459376Z return func(*args, **kwargs) 2022-11-23T02:48:20.2459765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2459873Z p_assert( 2022-11-23T02:48:20.2460202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2460332Z traceback.print_stack() 2022-11-23T02:48:20.2460739Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2460873Z File "", line 1, in 2022-11-23T02:48:20.2461089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2461233Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2461441Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2461660Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2461859Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2461967Z self.run() 2022-11-23T02:48:20.2462169Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2462325Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2462679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2462815Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2463186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2463299Z getattr(self, test_name)() 2022-11-23T02:48:20.2463664Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2463770Z fn() 2022-11-23T02:48:20.2464140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2464266Z test(self, **param_kwargs) 2022-11-23T02:48:20.2464628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2464808Z return func(*args, **kwargs) 2022-11-23T02:48:20.2465074Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2465173Z self.run_subtests( 2022-11-23T02:48:20.2465541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2465710Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2466081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2466241Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2466624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2466746Z output = model(*input) 2022-11-23T02:48:20.2467079Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2467209Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2467595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2467774Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2468147Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2468273Z _lazy_init(state, module) 2022-11-23T02:48:20.2468632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2468783Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2469129Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2469241Z return func(*args, **kwargs) 2022-11-23T02:48:20.2469636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2469744Z p_assert( 2022-11-23T02:48:20.2470093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2470225Z traceback.print_stack() 2022-11-23T02:48:20.2470475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.2470725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.2471136Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-11-23T02:48:20.2471330Z File "", line 1, in 2022-11-23T02:48:20.2471530Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2471675Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2471884Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2472040Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2472257Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2472366Z self.run() 2022-11-23T02:48:20.2472569Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2472755Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2473114Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2473255Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2473649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2473777Z getattr(self, test_name)() 2022-11-23T02:48:20.2474143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2474299Z fn() 2022-11-23T02:48:20.2474689Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2474799Z test(self, **param_kwargs) 2022-11-23T02:48:20.2475379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2475517Z return func(*args, **kwargs) 2022-11-23T02:48:20.2475770Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2475895Z self.run_subtests( 2022-11-23T02:48:20.2476263Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2476431Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2476800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2476946Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2477332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2477456Z output = model(*input) 2022-11-23T02:48:20.2477790Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2477937Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2478323Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2478509Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2478884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2478992Z _lazy_init(state, module) 2022-11-23T02:48:20.2479358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2479506Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2479854Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2479983Z return func(*args, **kwargs) 2022-11-23T02:48:20.2480368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2480475Z p_assert( 2022-11-23T02:48:20.2480817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2481097Z traceback.print_stack() 2022-11-23T02:48:20.2481515Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2481651Z File "", line 1, in 2022-11-23T02:48:20.2481873Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2482021Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2482228Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2482382Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2482600Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2482690Z self.run() 2022-11-23T02:48:20.2482895Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2483045Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2483401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2483538Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2483907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2484034Z getattr(self, test_name)() 2022-11-23T02:48:20.2484445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2484559Z fn() 2022-11-23T02:48:20.2484938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2485063Z test(self, **param_kwargs) 2022-11-23T02:48:20.2485420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2485549Z return func(*args, **kwargs) 2022-11-23T02:48:20.2485807Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2485923Z self.run_subtests( 2022-11-23T02:48:20.2486264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2486440Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2486812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2486970Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2487354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2487477Z output = model(*input) 2022-11-23T02:48:20.2487809Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2487959Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2488326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2488506Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2488884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2489009Z _lazy_init(state, module) 2022-11-23T02:48:20.2489366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2489510Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2489852Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2489980Z return func(*args, **kwargs) 2022-11-23T02:48:20.2490345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2490524Z p_assert( 2022-11-23T02:48:20.2490871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2491000Z traceback.print_stack() 2022-11-23T02:48:20.2491249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.2491503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.2491910Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.2492046Z File "", line 1, in 2022-11-23T02:48:20.2492246Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2492390Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2492603Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2492763Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2492981Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2493089Z self.run() 2022-11-23T02:48:20.2493297Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2493497Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2493861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2493981Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2494350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2494480Z getattr(self, test_name)() 2022-11-23T02:48:20.2494846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2494956Z fn() 2022-11-23T02:48:20.2495332Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2495459Z test(self, **param_kwargs) 2022-11-23T02:48:20.2495823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2495940Z return func(*args, **kwargs) 2022-11-23T02:48:20.2496196Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2496316Z self.run_subtests( 2022-11-23T02:48:20.2496680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2496848Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2497220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2497385Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2497775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2497881Z output = model(*input) 2022-11-23T02:48:20.2498216Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2498365Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2498751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2498933Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2499308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2499436Z _lazy_init(state, module) 2022-11-23T02:48:20.2499794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2499989Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2500338Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2500468Z return func(*args, **kwargs) 2022-11-23T02:48:20.2500859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2500969Z p_assert( 2022-11-23T02:48:20.2501315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2501446Z traceback.print_stack() 2022-11-23T02:48:20.2501856Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.2501972Z File "", line 1, in 2022-11-23T02:48:20.2502188Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2502339Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2502541Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2502694Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2502911Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2503072Z self.run() 2022-11-23T02:48:20.2503290Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2503422Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2503776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2503914Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2504281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2504410Z getattr(self, test_name)() 2022-11-23T02:48:20.2504785Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2504890Z fn() 2022-11-23T02:48:20.2505247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2505376Z test(self, **param_kwargs) 2022-11-23T02:48:20.2505742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2505870Z return func(*args, **kwargs) 2022-11-23T02:48:20.2506124Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2506242Z self.run_subtests( 2022-11-23T02:48:20.2506602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2506772Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2507134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2507294Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2507678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2507804Z output = model(*input) 2022-11-23T02:48:20.2508139Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2508289Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2508669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2508849Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2509229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2509401Z _lazy_init(state, module) 2022-11-23T02:48:20.2509764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2509908Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2510253Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2510386Z return func(*args, **kwargs) 2022-11-23T02:48:20.2510775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2510880Z p_assert( 2022-11-23T02:48:20.2511205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2511337Z traceback.print_stack() 2022-11-23T02:48:20.2511587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.2511844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.2512254Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 
2022-11-23T02:48:20.2512391Z File "", line 1, in 2022-11-23T02:48:20.2512661Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2512817Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2513005Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2513158Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2513376Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2513485Z self.run() 2022-11-23T02:48:20.2513692Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2513842Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2514198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2514337Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2514690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2514819Z getattr(self, test_name)() 2022-11-23T02:48:20.2515414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2515525Z fn() 2022-11-23T02:48:20.2515906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2516034Z test(self, **param_kwargs) 2022-11-23T02:48:20.2516396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2516525Z return func(*args, **kwargs) 2022-11-23T02:48:20.2516764Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2516882Z self.run_subtests( 2022-11-23T02:48:20.2517244Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2517414Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2517790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2517950Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2518333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2518457Z output = model(*input) 2022-11-23T02:48:20.2518772Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2519014Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2519401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2519581Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2519960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2520087Z _lazy_init(state, module) 2022-11-23T02:48:20.2520447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2520594Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2520924Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2521140Z return func(*args, **kwargs) 2022-11-23T02:48:20.2521526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2521640Z p_assert( 2022-11-23T02:48:20.2521986Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2522117Z traceback.print_stack() 2022-11-23T02:48:20.2522586Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2522732Z File "", line 1, in 2022-11-23T02:48:20.2522932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2523077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2523284Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2523442Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2523660Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2523774Z self.run() 2022-11-23T02:48:20.2523982Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2524133Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2524468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2524602Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2524977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2525107Z getattr(self, test_name)() 2022-11-23T02:48:20.2525472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2525577Z fn() 2022-11-23T02:48:20.2525950Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2526060Z test(self, **param_kwargs) 2022-11-23T02:48:20.2526434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2526564Z return func(*args, **kwargs) 2022-11-23T02:48:20.2526822Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2526942Z self.run_subtests( 2022-11-23T02:48:20.2527308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2527479Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2527854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2527992Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2528376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2528502Z output = model(*input) 2022-11-23T02:48:20.2528900Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2529047Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2529429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2529616Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2529990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2530115Z _lazy_init(state, module) 2022-11-23T02:48:20.2530456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2530605Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2530954Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2531091Z return func(*args, **kwargs) 2022-11-23T02:48:20.2531482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2531590Z p_assert( 2022-11-23T02:48:20.2531935Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2532099Z traceback.print_stack() 2022-11-23T02:48:20.2532363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.2532611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.2533023Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.2533159Z File "", line 1, in 2022-11-23T02:48:20.2533376Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2533530Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2533738Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2533876Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2534097Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2534209Z self.run() 2022-11-23T02:48:20.2534415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2534568Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2534917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2535055Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2535430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2535540Z getattr(self, test_name)() 2022-11-23T02:48:20.2535911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2536015Z fn() 2022-11-23T02:48:20.2536386Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2536514Z test(self, **param_kwargs) 2022-11-23T02:48:20.2536885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2537018Z return func(*args, **kwargs) 2022-11-23T02:48:20.2537273Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2537373Z self.run_subtests( 2022-11-23T02:48:20.2537736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2537903Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2538344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2538498Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2538885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2539015Z output = model(*input) 2022-11-23T02:48:20.2539348Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2539475Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2539858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2540042Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2540419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2540546Z _lazy_init(state, module) 2022-11-23T02:48:20.2540906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2541053Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2541400Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2541562Z return func(*args, **kwargs) 2022-11-23T02:48:20.2541965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2542068Z p_assert( 2022-11-23T02:48:20.2542412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2542541Z traceback.print_stack() 2022-11-23T02:48:20.2542949Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.2543089Z File "", line 1, in 2022-11-23T02:48:20.2543306Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2543436Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2543647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2543807Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2544027Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2544140Z self.run() 2022-11-23T02:48:20.2544348Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2544498Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2544852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2544971Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2545345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2545475Z getattr(self, test_name)() 2022-11-23T02:48:20.2545843Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2545948Z fn() 2022-11-23T02:48:20.2546324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2546454Z test(self, **param_kwargs) 2022-11-23T02:48:20.2546804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2546936Z return func(*args, **kwargs) 2022-11-23T02:48:20.2547189Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2547307Z self.run_subtests( 2022-11-23T02:48:20.2547672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2547915Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2548291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2548449Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2548822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2548949Z output = model(*input) 2022-11-23T02:48:20.2549283Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2549428Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2549815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2550000Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2550380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2550508Z _lazy_init(state, module) 2022-11-23T02:48:20.2550869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2551049Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2551412Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2551539Z return func(*args, **kwargs) 2022-11-23T02:48:20.2551923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2552032Z p_assert( 2022-11-23T02:48:20.2552377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2552513Z traceback.print_stack() 2022-11-23T02:48:20.2552747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.2552990Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.2553409Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:48:20.2553546Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2553761Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2553906Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2554112Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2554266Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2554466Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2554581Z self.run() 2022-11-23T02:48:20.2554792Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2554943Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2555522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2555659Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2556039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2556167Z getattr(self, test_name)() 2022-11-23T02:48:20.2556517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2556621Z fn() 2022-11-23T02:48:20.2556992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2557121Z test(self, **param_kwargs) 2022-11-23T02:48:20.2557579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2557709Z return func(*args, **kwargs) 2022-11-23T02:48:20.2557963Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2558081Z self.run_subtests( 2022-11-23T02:48:20.2558431Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2558602Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2558978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2559134Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2559517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2559641Z output = model(*input) 2022-11-23T02:48:20.2559979Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2560126Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2560494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2560740Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2561132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2561257Z _lazy_init(state, module) 2022-11-23T02:48:20.2561614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2561760Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2562106Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2562242Z return func(*args, **kwargs) 2022-11-23T02:48:20.2562614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2562722Z p_assert( 2022-11-23T02:48:20.2563065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2563202Z traceback.print_stack() 2022-11-23T02:48:20.2563616Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2563750Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2563966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2564113Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2564301Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2564455Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2564678Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2564785Z self.run() 2022-11-23T02:48:20.2564993Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2565143Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2565498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2565641Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2565998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2566129Z getattr(self, test_name)() 2022-11-23T02:48:20.2566499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2566604Z fn() 2022-11-23T02:48:20.2566980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2567173Z test(self, **param_kwargs) 2022-11-23T02:48:20.2567540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2567651Z return func(*args, **kwargs) 2022-11-23T02:48:20.2567910Z File
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2568029Z self.run_subtests( 2022-11-23T02:48:20.2568392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2568562Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2568934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2569093Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2569480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2569610Z output = model(*input) 2022-11-23T02:48:20.2569927Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2570074Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2570513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2570708Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2571080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2571206Z _lazy_init(state, module) 2022-11-23T02:48:20.2571563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2571716Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2572050Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2572178Z return func(*args, **kwargs) 2022-11-23T02:48:20.2572570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2572682Z p_assert( 2022-11-23T02:48:20.2573028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2573158Z traceback.print_stack() 2022-11-23T02:48:20.2573406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.2573689Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.2574094Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.2574232Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2574448Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2574595Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2574802Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2574960Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2575178Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2575268Z self.run() 2022-11-23T02:48:20.2575478Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2575626Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2575977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2576115Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2576555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2576684Z getattr(self, test_name)() 2022-11-23T02:48:20.2577052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2577135Z fn() 2022-11-23T02:48:20.2577514Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2577644Z test(self, **param_kwargs) 2022-11-23T02:48:20.2578008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2578138Z return func(*args, **kwargs) 2022-11-23T02:48:20.2578393Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2578516Z self.run_subtests( 2022-11-23T02:48:20.2578876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2579030Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2579404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2579565Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2580001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2580134Z output = model(*input) 2022-11-23T02:48:20.2580470Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2580617Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2581001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2581165Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2581550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2581678Z _lazy_init(state, module) 2022-11-23T02:48:20.2582038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2582189Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2582536Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2582665Z return func(*args, **kwargs) 2022-11-23T02:48:20.2583053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2583142Z p_assert( 2022-11-23T02:48:20.2583487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2583624Z traceback.print_stack() 2022-11-23T02:48:20.2584036Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.2584170Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2584385Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2584536Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2584745Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2584880Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2585097Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2585204Z self.run() 2022-11-23T02:48:20.2585409Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2585558Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2585904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2586102Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2586476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2586586Z getattr(self, test_name)() 2022-11-23T02:48:20.2586958Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2587065Z fn() 2022-11-23T02:48:20.2587444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2587572Z test(self, **param_kwargs) 2022-11-23T02:48:20.2587941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2588072Z return func(*args, **kwargs) 2022-11-23T02:48:20.2588307Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2588431Z self.run_subtests( 2022-11-23T02:48:20.2588794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2588961Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2589386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2589553Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2589938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2590063Z output = model(*input) 2022-11-23T02:48:20.2590397Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2590524Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2590918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2591102Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2591478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2591604Z _lazy_init(state, module) 2022-11-23T02:48:20.2591966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2592116Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2592464Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2592579Z return func(*args, **kwargs) 2022-11-23T02:48:20.2592967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2593080Z p_assert( 2022-11-23T02:48:20.2593426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2593557Z traceback.print_stack() 2022-11-23T02:48:20.2593809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.2594058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.2594470Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:48:20.2594587Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2594807Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2594954Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2595383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2595546Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2595851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2595960Z self.run() 2022-11-23T02:48:20.2596147Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2596299Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2596663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2596803Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2597176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2597305Z getattr(self, test_name)() 2022-11-23T02:48:20.2597668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2597771Z fn() 2022-11-23T02:48:20.2598128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2598261Z test(self, **param_kwargs) 2022-11-23T02:48:20.2598627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2598757Z return func(*args, **kwargs) 2022-11-23T02:48:20.2599072Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2599199Z self.run_subtests( 2022-11-23T02:48:20.2599562Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2599731Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2600087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2600243Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2600638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2600763Z output = model(*input) 2022-11-23T02:48:20.2601096Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2601242Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2601630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2601813Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2602170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2602295Z _lazy_init(state, module) 2022-11-23T02:48:20.2602652Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2602804Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2603150Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2603278Z return func(*args, **kwargs) 2022-11-23T02:48:20.2603664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2603781Z p_assert( 2022-11-23T02:48:20.2604112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2604244Z traceback.print_stack() 2022-11-23T02:48:20.2604654Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.2604789Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2605004Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2605148Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2605425Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2605582Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2605784Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2605893Z self.run() 2022-11-23T02:48:20.2606103Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2606255Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2606605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2606743Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2607111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2607239Z getattr(self, test_name)() 2022-11-23T02:48:20.2607587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2607694Z fn() 2022-11-23T02:48:20.2608065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2608193Z test(self, **param_kwargs) 2022-11-23T02:48:20.2608604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2608745Z return func(*args, **kwargs) 2022-11-23T02:48:20.2608999Z File
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2609097Z self.run_subtests( 2022-11-23T02:48:20.2609461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2609626Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2609997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2610165Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2610551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2610674Z output = model(*input) 2022-11-23T02:48:20.2611012Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2611159Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2611531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2611714Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2612091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2612218Z _lazy_init(state, module) 2022-11-23T02:48:20.2612580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2612728Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2613072Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2613201Z return func(*args, **kwargs) 2022-11-23T02:48:20.2613574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2613684Z p_assert( 2022-11-23T02:48:20.2614030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2614160Z traceback.print_stack() 2022-11-23T02:48:20.2614414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.2614661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.2615149Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.2615283Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2615483Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2615634Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2615841Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2615996Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2616213Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2616321Z self.run() 2022-11-23T02:48:20.2616528Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2616660Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2617014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2617154Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2617526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2617653Z getattr(self, test_name)() 2022-11-23T02:48:20.2618072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2618180Z fn() 2022-11-23T02:48:20.2618557Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2618666Z test(self, **param_kwargs) 2022-11-23T02:48:20.2619029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2619158Z return func(*args, **kwargs) 2022-11-23T02:48:20.2619413Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2619537Z self.run_subtests( 2022-11-23T02:48:20.2619902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2620072Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2620452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2620594Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2620978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2621104Z output = model(*input) 2022-11-23T02:48:20.2621437Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2621583Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2621972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2622156Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2622531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2622639Z _lazy_init(state, module) 2022-11-23T02:48:20.2623002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2623150Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2623496Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2623623Z return func(*args, **kwargs) 2022-11-23T02:48:20.2624013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2624182Z p_assert( 2022-11-23T02:48:20.2624529Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2624641Z traceback.print_stack() 2022-11-23T02:48:20.2625050Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.2625184Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.2625404Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2625551Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2625756Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2625912Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2626131Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2626220Z self.run() 2022-11-23T02:48:20.2626428Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2626581Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2626932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2627072Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2627541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2627679Z getattr(self, test_name)() 2022-11-23T02:48:20.2628048Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2628131Z fn() 2022-11-23T02:48:20.2628502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2628629Z test(self, **param_kwargs) 2022-11-23T02:48:20.2628989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2629126Z return func(*args, **kwargs) 2022-11-23T02:48:20.2629383Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2629500Z self.run_subtests( 2022-11-23T02:48:20.2629847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2630015Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2630387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2630543Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2630924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2631047Z output = model(*input) 2022-11-23T02:48:20.2631381Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2631532Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2631920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2632084Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2632461Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2632589Z _lazy_init(state, module) 2022-11-23T02:48:20.2632949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2633096Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2633447Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2633579Z return func(*args, **kwargs) 2022-11-23T02:48:20.2634030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2634119Z p_assert( 2022-11-23T02:48:20.2634465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2634595Z traceback.print_stack() 2022-11-23T02:48:20.2634852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.2635312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.2635736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.2636141Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.2636385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.2636612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.2637015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.2637493Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-11-23T02:48:20.2637754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.2637994Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.2638396Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.2638793Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.2639049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.2639294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.2639705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2640088Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.2640852Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2641103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.2641347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.2641754Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 
2022-11-23T02:48:20.2642162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.2642927Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2643666Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2643996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.2644236Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.2644644Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.2645049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.2645278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.2645518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.2645917Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.2646317Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 
2022-11-23T02:48:20.2646564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.2646854Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.2647269Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.2647663Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.2647907Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.2648128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.2648540Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.2648942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.2649193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.2649438Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.2649840Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.2650238Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:48:20.2650483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.2650728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.2651112Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.2651511Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.2652271Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2653015Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2653326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.2653568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.2653978Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.2654381Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 
2022-11-23T02:48:20.2654627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:48:20.2654867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:48:20.2655265Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.2655665Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.2655895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:48:20.2656182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:48:20.2656596Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.2656991Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.2657238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:48:20.2657479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:48:20.2657886Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.2658289Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 
2022-11-23T02:48:20.2658535Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:48:20.2658756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:48:20.2659151Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.2659547Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.2660053Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:48:20.2660302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:48:20.2660705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.2661104Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.2661859Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2662601Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.2662918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:48:20.2663159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:48:20.2663552Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.2663954Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.2664201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:48:20.2664446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:48:20.2664845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.2665598Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2666042Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.2666803Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.2667045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:48:20.2667294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:48:20.2667697Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.2668084Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.2668332Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:48:20.2668569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:48:20.2668973Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.2669372Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.2669624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:48:20.2669867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:48:20.2670268Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.2670669Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 
2022-11-23T02:48:20.2670918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:48:20.2671139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:48:20.2671540Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.2671939Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.2672754Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2673491Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.2673782Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:48:20.2674027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:48:20.2674437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.2674837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.2674957Z dist init r=1, world=2 2022-11-23T02:48:20.2675573Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2675908Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2676230Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2676545Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2676883Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2677205Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2677522Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2677855Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2678175Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2678497Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2678831Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.2679150Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.2679251Z dist init r=0, world=2 2022-11-23T02:48:20.2679567Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2679882Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2680278Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2680589Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2680915Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2681232Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2681542Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2681869Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2682187Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.2682549Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2682882Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2683175Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.2683283Z ok (6.916s) 2022-11-23T02:48:20.2683651Z test_mixture_of_experts_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87456 2022-11-23T02:48:20.2683881Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87457 2022-11-23T02:48:20.2684282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2684467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2684861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2685060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.2685433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.2685594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.2685984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.2686179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.2686431Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.2686682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.2687096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.2687500Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.2687738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.2687972Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.2689079Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.2689181Z warnings.warn( 2022-11-23T02:48:20.2689429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.2690459Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.2690581Z warnings.warn( 2022-11-23T02:48:20.2690832Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.2691286Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.2691429Z File "", line 1, in 2022-11-23T02:48:20.2691647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2691795Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2692006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2692143Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2692365Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2692480Z self.run() 2022-11-23T02:48:20.2692689Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2692843Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2693196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2693335Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2693693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2693824Z getattr(self, test_name)() 2022-11-23T02:48:20.2694194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2694297Z fn() 2022-11-23T02:48:20.2694672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2694801Z test(self, **param_kwargs) 2022-11-23T02:48:20.2695166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 
2022-11-23T02:48:20.2695298Z return func(*args, **kwargs) 2022-11-23T02:48:20.2695533Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2695652Z self.run_subtests( 2022-11-23T02:48:20.2696021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2696195Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2696570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2696730Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2697115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2697319Z output = model(*input) 2022-11-23T02:48:20.2697644Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2697787Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2698172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2698357Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2698735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2698862Z _lazy_init(state, module) 2022-11-23T02:48:20.2699224Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2699373Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2699705Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2699841Z return func(*args, **kwargs) 2022-11-23T02:48:20.2700232Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2700338Z p_assert( 2022-11-23T02:48:20.2700739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2700880Z traceback.print_stack() 2022-11-23T02:48:20.2701290Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.2701423Z File "", line 1, in 2022-11-23T02:48:20.2701625Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2701773Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2701982Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2702144Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2702362Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2702470Z self.run() 2022-11-23T02:48:20.2702673Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2702825Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2703163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2703302Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2703679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2703809Z getattr(self, test_name)() 2022-11-23T02:48:20.2704182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2704285Z fn() 2022-11-23T02:48:20.2704665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2704775Z test(self, 
**param_kwargs) 2022-11-23T02:48:20.2705144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2705275Z return func(*args, **kwargs) 2022-11-23T02:48:20.2705533Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2705652Z self.run_subtests( 2022-11-23T02:48:20.2706017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2706184Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2706558Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2706696Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2707156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2707280Z output = model(*input) 2022-11-23T02:48:20.2707616Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2707766Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2708152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2708335Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2708711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2708838Z _lazy_init(state, module) 2022-11-23T02:48:20.2709185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2709338Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2709685Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2709814Z return func(*args, **kwargs) 2022-11-23T02:48:20.2710250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2710364Z p_assert( 2022-11-23T02:48:20.2710710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2710824Z traceback.print_stack() 2022-11-23T02:48:20.2711076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.2711325Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.2711733Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.2711874Z File "", line 1, in 2022-11-23T02:48:20.2712092Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2712239Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2712446Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2712587Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2712804Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2712911Z self.run() 2022-11-23T02:48:20.2713116Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2713268Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2713618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2713756Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2714137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:48:20.2714248Z getattr(self, test_name)() 2022-11-23T02:48:20.2714620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2714725Z fn() 2022-11-23T02:48:20.2715336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2715475Z test(self, **param_kwargs) 2022-11-23T02:48:20.2715845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2715973Z return func(*args, **kwargs) 2022-11-23T02:48:20.2716229Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2716328Z self.run_subtests( 2022-11-23T02:48:20.2716790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2716955Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2717329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2717490Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2717875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2718000Z output = model(*input) 2022-11-23T02:48:20.2718334Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2718460Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2718844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2719032Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2719412Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2719540Z _lazy_init(state, module) 2022-11-23T02:48:20.2719964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2720123Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2720474Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2720585Z return func(*args, **kwargs) 2022-11-23T02:48:20.2720975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2721082Z p_assert( 2022-11-23T02:48:20.2721429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2721570Z traceback.print_stack() 2022-11-23T02:48:20.2721983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:48:20.2722118Z File "", line 1, in 2022-11-23T02:48:20.2722330Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2722463Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2722677Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2722831Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2723047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2723157Z self.run() 2022-11-23T02:48:20.2723363Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2723513Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2723867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2723987Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2724358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2724488Z getattr(self, test_name)() 2022-11-23T02:48:20.2724857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2724962Z fn() 2022-11-23T02:48:20.2725337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2725467Z test(self, **param_kwargs) 2022-11-23T02:48:20.2725820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2725949Z return func(*args, **kwargs) 2022-11-23T02:48:20.2726272Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2726392Z self.run_subtests( 2022-11-23T02:48:20.2726756Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2726928Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2727305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2727467Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2727856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2727965Z output = model(*input) 2022-11-23T02:48:20.2728304Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2728449Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2728839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2729019Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2729395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2729587Z _lazy_init(state, module) 2022-11-23T02:48:20.2729958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2730088Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2730437Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2730567Z return func(*args, **kwargs) 2022-11-23T02:48:20.2730956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2731069Z p_assert( 2022-11-23T02:48:20.2731414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2731545Z traceback.print_stack() 2022-11-23T02:48:20.2731778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.2732035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.2732446Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2732581Z File "", line 1, in 2022-11-23T02:48:20.2732797Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2732945Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2733153Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2733315Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2733517Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2733625Z self.run() 2022-11-23T02:48:20.2733833Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2733981Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2734339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2734479Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2734854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2734982Z getattr(self, test_name)() 2022-11-23T02:48:20.2735332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2735435Z fn() 2022-11-23T02:48:20.2735877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2736003Z test(self, **param_kwargs) 
2022-11-23T02:48:20.2736368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2736498Z return func(*args, **kwargs) 2022-11-23T02:48:20.2736756Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2736876Z self.run_subtests( 2022-11-23T02:48:20.2737221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2737386Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2737761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2737919Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2738309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2738435Z output = model(*input) 2022-11-23T02:48:20.2738772Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2738976Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2739359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2739540Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2739915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2740041Z _lazy_init(state, module) 2022-11-23T02:48:20.2740400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2740556Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2740906Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 
27, in decorate_context 2022-11-23T02:48:20.2741040Z return func(*args, **kwargs) 2022-11-23T02:48:20.2741413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2741523Z p_assert( 2022-11-23T02:48:20.2741870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2742002Z traceback.print_stack() 2022-11-23T02:48:20.2742410Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.2742544Z File "", line 1, in 2022-11-23T02:48:20.2742762Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2742914Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2743104Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2743263Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2743481Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2743592Z self.run() 2022-11-23T02:48:20.2743806Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2743957Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2744310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2744448Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2744804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2744934Z getattr(self, test_name)() 2022-11-23T02:48:20.2745383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2745488Z fn() 2022-11-23T02:48:20.2745861Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2745988Z test(self, **param_kwargs) 2022-11-23T02:48:20.2746356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2746470Z return func(*args, **kwargs) 2022-11-23T02:48:20.2746724Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2746842Z self.run_subtests( 2022-11-23T02:48:20.2747207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2747375Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2747753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2747913Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2748298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2748423Z output = model(*input) 2022-11-23T02:48:20.2748789Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2748947Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2749333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2749513Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2749887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2750020Z _lazy_init(state, module) 2022-11-23T02:48:20.2750379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2750527Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2750859Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2750995Z return func(*args, **kwargs) 2022-11-23T02:48:20.2751385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2751492Z p_assert( 2022-11-23T02:48:20.2751833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2751964Z traceback.print_stack() 2022-11-23T02:48:20.2752216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.2752465Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.2752861Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-11-23T02:48:20.2752994Z File "", line 1, in 2022-11-23T02:48:20.2753213Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2753366Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2753578Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2753736Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2753954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2754046Z self.run() 2022-11-23T02:48:20.2754257Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2754408Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2754827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2754964Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2755566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2755694Z getattr(self, test_name)() 2022-11-23T02:48:20.2756075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2756160Z fn() 2022-11-23T02:48:20.2756536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2756663Z test(self, **param_kwargs) 2022-11-23T02:48:20.2757026Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2757156Z return func(*args, **kwargs) 2022-11-23T02:48:20.2757415Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2757533Z self.run_subtests( 2022-11-23T02:48:20.2757895Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2758044Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2758492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2758669Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2759058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2759183Z output = model(*input) 2022-11-23T02:48:20.2759516Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2759662Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2760056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2760221Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2760692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2760914Z _lazy_init(state, module) 2022-11-23T02:48:20.2811370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2811586Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2812009Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2812133Z return func(*args, **kwargs) 2022-11-23T02:48:20.2812536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2812650Z p_assert( 2022-11-23T02:48:20.2813013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2813136Z traceback.print_stack() 2022-11-23T02:48:20.2813554Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.2813688Z File "", line 1, in 2022-11-23T02:48:20.2813905Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2814046Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2814251Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2814401Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2814618Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2814716Z self.run() 2022-11-23T02:48:20.2815119Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2815263Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2815630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2815764Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2816149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2816272Z getattr(self, test_name)() 2022-11-23T02:48:20.2816655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2816747Z fn() 2022-11-23T02:48:20.2817138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2817259Z test(self, **param_kwargs) 2022-11-23T02:48:20.2817640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2817764Z return func(*args, **kwargs) 2022-11-23T02:48:20.2818023Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2818132Z self.run_subtests( 2022-11-23T02:48:20.2818591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2818770Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2819164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2819317Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2819718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2819833Z output = model(*input) 2022-11-23T02:48:20.2820178Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2820319Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2820722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2820903Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2821299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2821418Z _lazy_init(state, module) 2022-11-23T02:48:20.2821793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2821935Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2822291Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2822414Z return func(*args, **kwargs) 2022-11-23T02:48:20.2822827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2822923Z p_assert( 2022-11-23T02:48:20.2823281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2823405Z traceback.print_stack() 2022-11-23T02:48:20.2823657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.2823908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.2824320Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2824447Z File "", line 1, in 2022-11-23T02:48:20.2824663Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2824882Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2825089Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2825238Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2825456Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2825547Z self.run() 2022-11-23T02:48:20.2825763Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2825909Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2826273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2826403Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2826786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2826907Z getattr(self, test_name)() 2022-11-23T02:48:20.2827289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2827381Z fn() 2022-11-23T02:48:20.2827773Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2827891Z test(self, **param_kwargs) 2022-11-23T02:48:20.2828329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2828461Z return func(*args, **kwargs) 2022-11-23T02:48:20.2828720Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2828827Z self.run_subtests( 2022-11-23T02:48:20.2829206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2829362Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2829756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2829912Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2830315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2830432Z output = model(*input) 2022-11-23T02:48:20.2830783Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2830923Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2831323Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2831495Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2831889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2832013Z _lazy_init(state, module) 2022-11-23T02:48:20.2832385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2832526Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2832883Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2833009Z return func(*args, **kwargs) 2022-11-23T02:48:20.2833416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2833509Z p_assert( 2022-11-23T02:48:20.2833866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2833987Z traceback.print_stack() 2022-11-23T02:48:20.2834402Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.2834596Z File "", line 1, in 2022-11-23T02:48:20.2834815Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2834955Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2835459Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2835612Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2835837Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2835937Z self.run() 2022-11-23T02:48:20.2836146Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2836291Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2836665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2836794Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2837178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2837299Z getattr(self, test_name)() 2022-11-23T02:48:20.2837681Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2837773Z fn() 2022-11-23T02:48:20.2838244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2838379Z test(self, **param_kwargs) 2022-11-23T02:48:20.2838766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2838886Z return func(*args, **kwargs) 2022-11-23T02:48:20.2839137Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2839244Z self.run_subtests( 2022-11-23T02:48:20.2839619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2839790Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2840178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2840331Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2840737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2840854Z output = model(*input) 2022-11-23T02:48:20.2841203Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2841337Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2841735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2841914Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2842309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2842428Z _lazy_init(state, module) 2022-11-23T02:48:20.2842806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2842947Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2843312Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2843429Z return func(*args, **kwargs) 2022-11-23T02:48:20.2843832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2843931Z p_assert( 2022-11-23T02:48:20.2844289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2844413Z traceback.print_stack() 2022-11-23T02:48:20.2844774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.2845023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.2845450Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:48:20.2845576Z File "", line 1, in 2022-11-23T02:48:20.2845791Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2845931Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2846138Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2846286Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2846503Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2846600Z self.run() 2022-11-23T02:48:20.2846809Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2846956Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2847323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2847453Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2847895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2848028Z getattr(self, test_name)() 2022-11-23T02:48:20.2848414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2848505Z fn() 2022-11-23T02:48:20.2848887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2849007Z test(self, **param_kwargs) 2022-11-23T02:48:20.2849387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2849515Z return func(*args, **kwargs) 2022-11-23T02:48:20.2849772Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2849879Z self.run_subtests( 2022-11-23T02:48:20.2850260Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2850427Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2850808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2850962Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2851362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2851478Z output = model(*input) 2022-11-23T02:48:20.2851833Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2851973Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2852375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2852553Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2852941Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2853062Z _lazy_init(state, module) 2022-11-23T02:48:20.2853434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2853576Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2853937Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2854059Z return func(*args, **kwargs) 2022-11-23T02:48:20.2854548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2854646Z p_assert( 2022-11-23T02:48:20.2854997Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2855121Z traceback.print_stack() 2022-11-23T02:48:20.2855543Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.2855674Z File "", line 1, in 2022-11-23T02:48:20.2855892Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2856034Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2856241Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2856391Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2856608Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2856706Z self.run() 2022-11-23T02:48:20.2856917Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2857065Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2857487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2857629Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2858022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2858458Z getattr(self, test_name)() 2022-11-23T02:48:20.2859011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2859396Z fn() 2022-11-23T02:48:20.2859902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2860321Z test(self, **param_kwargs) 2022-11-23T02:48:20.2860857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2861264Z return func(*args, **kwargs) 2022-11-23T02:48:20.2861680Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2862069Z self.run_subtests( 2022-11-23T02:48:20.2862589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2863035Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2863615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2864060Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2864648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2865060Z output = model(*input) 2022-11-23T02:48:20.2865566Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2865975Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2866543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2867023Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2867615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2868022Z _lazy_init(state, module) 2022-11-23T02:48:20.2868544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2869009Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2869620Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2870014Z return func(*args, **kwargs) 2022-11-23T02:48:20.2870572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2870970Z p_assert( 2022-11-23T02:48:20.2871464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2871865Z traceback.print_stack() 2022-11-23T02:48:20.2872278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.2872793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.2873485Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2873975Z File "", line 1, in 2022-11-23T02:48:20.2874363Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2874758Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2875389Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2875784Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2876271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2876628Z self.run() 2022-11-23T02:48:20.2876963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2877340Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2877885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2878295Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2878851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2879291Z getattr(self, test_name)() 2022-11-23T02:48:20.2879852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2880238Z fn() 2022-11-23T02:48:20.2880773Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2881202Z test(self, **param_kwargs) 2022-11-23T02:48:20.2881745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2882170Z return func(*args, **kwargs) 2022-11-23T02:48:20.2882602Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2883007Z self.run_subtests( 2022-11-23T02:48:20.2883533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2884003Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2884633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2885098Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2885708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2886143Z output = model(*input) 2022-11-23T02:48:20.2886676Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2887098Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2887712Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2888213Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2888922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2889354Z _lazy_init(state, module) 2022-11-23T02:48:20.2889882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2890321Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2890886Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2891286Z return func(*args, **kwargs) 2022-11-23T02:48:20.2891875Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2892292Z p_assert( 2022-11-23T02:48:20.2892787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2893202Z traceback.print_stack() 2022-11-23T02:48:20.2893804Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.2894267Z File "", line 1, in 2022-11-23T02:48:20.2894650Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2895055Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2895520Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2895945Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2896375Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2896746Z self.run() 2022-11-23T02:48:20.2897109Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2897495Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2898058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2898488Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2899043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2899473Z getattr(self, test_name)() 2022-11-23T02:48:20.2900032Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2900431Z fn() 2022-11-23T02:48:20.2900946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2901373Z test(self, **param_kwargs) 2022-11-23T02:48:20.2901931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2902343Z return func(*args, **kwargs) 2022-11-23T02:48:20.2902783Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2903194Z self.run_subtests( 2022-11-23T02:48:20.2903764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2904242Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2904870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2905341Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2905976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2906391Z output = model(*input) 2022-11-23T02:48:20.2906929Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2907375Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2907988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2908545Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2909175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2909611Z _lazy_init(state, module) 2022-11-23T02:48:20.2910191Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2910931Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2911495Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2911894Z return func(*args, **kwargs) 2022-11-23T02:48:20.2912427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2912827Z p_assert( 2022-11-23T02:48:20.2913312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2913719Z traceback.print_stack() 2022-11-23T02:48:20.2914115Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.2914632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.2915823Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:48:20.2916262Z File "", line 1, in 2022-11-23T02:48:20.2916650Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2917038Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2917404Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2917788Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2918197Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2918546Z self.run() 2022-11-23T02:48:20.2918871Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2919299Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2919843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2920226Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2920771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2921176Z getattr(self, test_name)() 2022-11-23T02:48:20.2921709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2922070Z fn() 2022-11-23T02:48:20.2922575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2922986Z test(self, **param_kwargs) 2022-11-23T02:48:20.2923497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2923907Z return func(*args, **kwargs) 2022-11-23T02:48:20.2924324Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2924689Z self.run_subtests( 2022-11-23T02:48:20.2925212Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2925647Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2926213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2926626Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2927198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2927699Z output = model(*input) 2022-11-23T02:48:20.2928179Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2928577Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2929147Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2929624Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2930189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2930596Z _lazy_init(state, module) 2022-11-23T02:48:20.2931126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2931527Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2932061Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2932456Z return func(*args, **kwargs) 2022-11-23T02:48:20.2933010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2933386Z p_assert( 2022-11-23T02:48:20.2933924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.2934325Z traceback.print_stack() 2022-11-23T02:48:20.2934882Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.2935321Z File "", line 1, in 2022-11-23T02:48:20.2935707Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2936093Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2936457Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2936850Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2937252Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2937580Z self.run() 2022-11-23T02:48:20.2937926Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2938309Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2938821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2939227Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2939769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2940176Z getattr(self, test_name)() 2022-11-23T02:48:20.2940687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2941074Z fn() 2022-11-23T02:48:20.2941583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2941969Z test(self, **param_kwargs) 2022-11-23T02:48:20.2942498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2942908Z return func(*args, **kwargs) 2022-11-23T02:48:20.2943302Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2943685Z self.run_subtests( 2022-11-23T02:48:20.2944200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2944636Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2945183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2945696Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2946270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2946662Z output = model(*input) 2022-11-23T02:48:20.2947163Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2947565Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2948131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2948579Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2949157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2949565Z _lazy_init(state, module) 2022-11-23T02:48:20.2950065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2950491Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2951021Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2951417Z return func(*args, **kwargs) 2022-11-23T02:48:20.2952007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.2952414Z p_assert( 2022-11-23T02:48:20.2952902Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2953278Z traceback.print_stack() 2022-11-23T02:48:20.2953696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.2954207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.2954890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2956061Z File "", line 1, in 2022-11-23T02:48:20.2956456Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2957128Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2957509Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2957895Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2958298Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2958628Z self.run() 2022-11-23T02:48:20.2958976Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2959353Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2959906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2960293Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2960838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2961245Z getattr(self, test_name)() 2022-11-23T02:48:20.2961759Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2962140Z fn() 2022-11-23T02:48:20.2962644Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2963052Z test(self, **param_kwargs) 2022-11-23T02:48:20.2963560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2963962Z return func(*args, **kwargs) 2022-11-23T02:48:20.2964374Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2964900Z self.run_subtests( 2022-11-23T02:48:20.2965415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2965852Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2966397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2966832Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2967718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2968130Z output = model(*input) 2022-11-23T02:48:20.2968604Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2969005Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2969566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2970047Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2970608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.2971012Z _lazy_init(state, module) 2022-11-23T02:48:20.2971623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.2972040Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2972568Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2972964Z return func(*args, **kwargs) 2022-11-23T02:48:20.2973491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2973933Z p_assert( 2022-11-23T02:48:20.2974418Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2974818Z traceback.print_stack() 2022-11-23T02:48:20.2975370Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.2975810Z File "", line 1, in 2022-11-23T02:48:20.2976195Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2976563Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2976947Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2977329Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2977730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2978061Z self.run() 2022-11-23T02:48:20.2978406Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2978789Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.2979304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.2979703Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.2980242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.2980629Z getattr(self, test_name)() 2022-11-23T02:48:20.2981684Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.2982118Z fn() 2022-11-23T02:48:20.2982903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.2983296Z test(self, **param_kwargs) 2022-11-23T02:48:20.2983826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.2984337Z return func(*args, **kwargs) 2022-11-23T02:48:20.2984729Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.2985113Z self.run_subtests( 2022-11-23T02:48:20.2985628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.2986071Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.2986619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.2987051Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.2987616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.2988003Z output = model(*input) 2022-11-23T02:48:20.2988497Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.2988903Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.2989447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.2989915Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.2990556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.2990974Z _lazy_init(state, module) 2022-11-23T02:48:20.2991481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.2991900Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.2992430Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.2992827Z return func(*args, **kwargs) 2022-11-23T02:48:20.2993357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.2993755Z p_assert( 2022-11-23T02:48:20.2994239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.2994613Z traceback.print_stack() 2022-11-23T02:48:20.2995226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.2995757Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.2996422Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:48:20.2996862Z File "", line 1, in 2022-11-23T02:48:20.2997247Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.2997630Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.2998000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.2998384Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.2998786Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.2999114Z self.run() 2022-11-23T02:48:20.2999457Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.2999840Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3000354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3000752Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3001294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3001697Z getattr(self, test_name)() 2022-11-23T02:48:20.3002206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3002700Z fn() 2022-11-23T02:48:20.3003213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3003599Z test(self, **param_kwargs) 2022-11-23T02:48:20.3004127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3004533Z return func(*args, **kwargs) 2022-11-23T02:48:20.3004948Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.3005317Z self.run_subtests( 2022-11-23T02:48:20.3005838Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3006276Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3006819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3007259Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3007830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3008239Z output = model(*input) 2022-11-23T02:48:20.3008785Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3009198Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3009769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3010222Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3010802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3011210Z _lazy_init(state, module) 2022-11-23T02:48:20.3011731Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3012134Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3012662Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3013056Z return func(*args, **kwargs) 2022-11-23T02:48:20.3013598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3013997Z p_assert( 2022-11-23T02:48:20.3014486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3014878Z traceback.print_stack() 2022-11-23T02:48:20.3015435Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3015872Z File "", line 1, in 2022-11-23T02:48:20.3016257Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3016626Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3017006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3017390Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3017777Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3018127Z self.run() 2022-11-23T02:48:20.3018472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3018851Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3019360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3019760Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3020303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3020759Z getattr(self, test_name)() 2022-11-23T02:48:20.3021294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3021669Z fn() 2022-11-23T02:48:20.3022155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3022565Z test(self, **param_kwargs) 2022-11-23T02:48:20.3023094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3023494Z return func(*args, **kwargs) 2022-11-23T02:48:20.3023887Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.3024271Z self.run_subtests( 2022-11-23T02:48:20.3024784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3025204Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3025767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3026203Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3026775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3027216Z output = model(*input) 2022-11-23T02:48:20.3027719Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3028115Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3028657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3029130Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3029710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3030126Z _lazy_init(state, module) 2022-11-23T02:48:20.3030629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3031048Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3031579Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3031956Z return func(*args, **kwargs) 2022-11-23T02:48:20.3032511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.3032905Z p_assert( 2022-11-23T02:48:20.3033391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3033761Z traceback.print_stack() 2022-11-23T02:48:20.3034173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.3034696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.3035564Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3036008Z File "", line 1, in 2022-11-23T02:48:20.3036396Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3036785Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3037149Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3037539Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3037938Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3038269Z self.run() 2022-11-23T02:48:20.3038614Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3039092Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3039607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3040007Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3040547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3040956Z getattr(self, test_name)() 2022-11-23T02:48:20.3041473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3041854Z fn() 2022-11-23T02:48:20.3042361Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3042748Z test(self, **param_kwargs) 2022-11-23T02:48:20.3043268Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3043676Z return func(*args, **kwargs) 2022-11-23T02:48:20.3044072Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.3044456Z self.run_subtests( 2022-11-23T02:48:20.3044968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3045531Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3046097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3046531Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3047101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3047489Z output = model(*input) 2022-11-23T02:48:20.3047982Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3048383Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3048950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3049398Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3049987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3050394Z _lazy_init(state, module) 2022-11-23T02:48:20.3050900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.3051596Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3052134Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3052533Z return func(*args, **kwargs) 2022-11-23T02:48:20.3053068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3053471Z p_assert( 2022-11-23T02:48:20.3053961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3054336Z traceback.print_stack() 2022-11-23T02:48:20.3054917Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3055360Z File "", line 1, in 2022-11-23T02:48:20.3055745Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3056108Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3056492Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3056877Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3057260Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3057699Z self.run() 2022-11-23T02:48:20.3058045Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3058401Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3058936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3059343Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3059887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3060272Z getattr(self, test_name)() 2022-11-23T02:48:20.3060812Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3061187Z fn() 2022-11-23T02:48:20.3061673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3062078Z test(self, **param_kwargs) 2022-11-23T02:48:20.3062613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3063017Z return func(*args, **kwargs) 2022-11-23T02:48:20.3063411Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.3063794Z self.run_subtests( 2022-11-23T02:48:20.3064722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3065266Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3066023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3066457Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3067011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3067429Z output = model(*input) 2022-11-23T02:48:20.3067920Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3068322Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3068862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3069342Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3069929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3070332Z _lazy_init(state, module) 2022-11-23T02:48:20.3070834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3071255Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3071783Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3072166Z return func(*args, **kwargs) 2022-11-23T02:48:20.3072778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3073178Z p_assert( 2022-11-23T02:48:20.3073651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3074079Z traceback.print_stack() 2022-11-23T02:48:20.3074496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.3075014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.3075892Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3076614Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-11-23T02:48:20.3077172Z File "", line 1, in 2022-11-23T02:48:20.3077559Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3077928Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3078315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3078702Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3079087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3079438Z self.run() 2022-11-23T02:48:20.3079791Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3080155Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3080693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3081096Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3081643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3082028Z getattr(self, test_name)() 2022-11-23T02:48:20.3082569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3082949Z fn() 2022-11-23T02:48:20.3083504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3083925Z test(self, **param_kwargs) 2022-11-23T02:48:20.3084455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3084862Z return func(*args, **kwargs) 2022-11-23T02:48:20.3085249Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.3085634Z self.run_subtests( 2022-11-23T02:48:20.3086150Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3086565Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3087131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3087571Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3088147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3088535Z output = model(*input) 2022-11-23T02:48:20.3089027Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3089427Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3089968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3090445Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3091028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3091432Z _lazy_init(state, module) 2022-11-23T02:48:20.3091934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3092356Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3092884Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3093262Z return func(*args, **kwargs) 2022-11-23T02:48:20.3093813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3093925Z p_assert( 2022-11-23T02:48:20.3094271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3094498Z traceback.print_stack() 2022-11-23T02:48:20.3094631Z File "", line 1, in 2022-11-23T02:48:20.3094828Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3094974Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3095188Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3095346Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3095565Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3095670Z self.run() 2022-11-23T02:48:20.3095875Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3096007Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3096368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3096504Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3096885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3097010Z getattr(self, test_name)() 2022-11-23T02:48:20.3097375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3097475Z fn() 2022-11-23T02:48:20.3097907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3098028Z test(self, **param_kwargs) 2022-11-23T02:48:20.3098402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3098537Z return func(*args, **kwargs) 2022-11-23T02:48:20.3098791Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:48:20.3098916Z self.run_subtests( 2022-11-23T02:48:20.3099285Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3099452Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3099828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3099990Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3100358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3100489Z output = model(*input) 2022-11-23T02:48:20.3100828Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3100971Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3101355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3101541Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3101923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3102048Z _lazy_init(state, module) 2022-11-23T02:48:20.3102394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3102549Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3102893Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3103022Z return func(*args, **kwargs) 2022-11-23T02:48:20.3103411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3103518Z p_assert( 2022-11-23T02:48:20.3103861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3104057Z traceback.print_stack() 2022-11-23T02:48:20.3104295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.3104539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.3104957Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3105374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3105620Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.3105857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.3106264Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3106676Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3106921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.3107188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.3107609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3108011Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:48:20.3108264Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.3108501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.3108905Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3109307Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3110078Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3110335Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.3110575Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.3110959Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3111374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3112144Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3112898Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3113147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.3113470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.3113880Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3114279Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3114537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.3114784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.3115487Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3115908Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3116136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.3116375Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.3116778Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3117259Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:48:20.3117519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.3117756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.3118162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3118562Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3118814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.3119035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.3119443Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3119846Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3120085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.3120328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.3120726Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3121135Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3121908Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3122669Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3122915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.3123157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.3123644Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3124025Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3124280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:48:20.3124518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:48:20.3124925Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.3125322Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.3125571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:48:20.3125812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:48:20.3126213Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 
2022-11-23T02:48:20.3126656Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.3126899Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:48:20.3127150Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:48:20.3127551Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.3127949Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.3128199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:48:20.3128436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:48:20.3128843Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.3129250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.3129493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:48:20.3129736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:48:20.3130112Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.3130528Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.3131290Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3132049Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3132304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:48:20.3132543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:48:20.3133016Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.3133418Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.3133672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:48:20.3133912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:48:20.3134317Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.3135077Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3135482Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.3136259Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3136522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:48:20.3136758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:48:20.3137159Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.3137570Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.3137813Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:48:20.3138052Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:48:20.3138457Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.3138863Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 
2022-11-23T02:48:20.3139105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:48:20.3139326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:48:20.3139727Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.3140137Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.3140383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:48:20.3140627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:48:20.3141024Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.3141421Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.3142174Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3142997Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3143256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:48:20.3143495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:48:20.3143897Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.3144282Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.3144406Z dist init r=1, world=2 2022-11-23T02:48:20.3144743Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3145127Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3145452Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3145766Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3146072Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3146390Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3146696Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.3147010Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3147317Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3147605Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3147920Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3148225Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3148342Z dist init r=0, world=2 2022-11-23T02:48:20.3148681Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3148999Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3149313Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3149628Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3149998Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.3150305Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3150616Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3150925Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3151212Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3151523Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3151872Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3152193Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3152298Z ok (6.916s) 2022-11-23T02:48:20.3152670Z test_mixture_of_experts_with_delay_before_free_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87803 2022-11-23T02:48:20.3152893Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87804 2022-11-23T02:48:20.3153294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3153477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3153868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3154053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3154435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3154617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3155006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3155385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3155637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.3155897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.3156311Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.3156721Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.3156941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.3157174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.3158208Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3158437Z warnings.warn( 2022-11-23T02:48:20.3158689Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.3159714Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3159830Z warnings.warn( 2022-11-23T02:48:20.3160075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.3160483Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.3160882Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.3161195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.3161439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.3161837Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3162235Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3162485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.3162733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.3163131Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3163535Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3164299Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3165056Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3165314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.3165557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.3165963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3166343Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3166590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.3166835Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.3167226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3167690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3167935Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.3168187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.3168574Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3168967Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:48:20.3169191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.3169432Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.3169833Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3170225Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3171321Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.3171541Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.3172565Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T02:48:20.3172779Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:48:20.3173027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.3173279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.3173678Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3174125Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3174443Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.3174664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.3175067Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.3175476Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.3175725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.3175970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.3176378Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3176788Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:48:20.3177104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.3177343Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.3177730Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3178138Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3178901Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3179161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.3179403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.3179805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3180262Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3180524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.3180772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.3181183Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 
2022-11-23T02:48:20.3182217Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:466: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:48:20.3182530Z p.detach().reshape(-1) if isinstance(p, nn.Parameter) else p.reshape(-1) 2022-11-23T02:48:20.3182921Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3183680Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3184443Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3185202Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3185460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.3185707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.3186114Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3186621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3186871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.3187118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.3187526Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3187926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3188674Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3188928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.3189148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.3189605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 
2022-11-23T02:48:20.3190022Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3190275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.3190519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.3190925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3191337Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3191583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.3191827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.3192234Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3192615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3192866Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.3193106Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.3193513Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3193917Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3194685Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3194939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.3195386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.3195797Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3196301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3196550Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.3196780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.3197187Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3197593Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3197839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.3198078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.3198489Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3198894Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:48:20.3199201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.3199460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.3199843Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3200240Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3200999Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3201263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.3201509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.3201913Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3202314Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3202435Z dist init r=1, world=2 2022-11-23T02:48:20.3202547Z dist init r=0, world=2 2022-11-23T02:48:20.3202651Z ok (28.849s) 2022-11-23T02:48:20.3202994Z test_mixture_of_experts_with_delay_before_free_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88126 2022-11-23T02:48:20.3203230Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88127 2022-11-23T02:48:20.3203614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3203801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3204196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3204396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3204776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3204957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3205347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3205589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3205843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.3206099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.3206516Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.3206922Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.3207156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.3207389Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.3208465Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3208597Z warnings.warn( 2022-11-23T02:48:20.3209629Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3209744Z warnings.warn( 2022-11-23T02:48:20.3209980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.3210234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.3210639Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.3211057Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.3211311Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.3211556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.3211954Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3212358Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3212610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.3212860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.3213244Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3213650Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3214415Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3215289Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3215545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.3215794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.3216194Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3216595Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3216839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.3217091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.3217489Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3217870Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3218174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.3218431Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.3218824Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3219218Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3219973Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3220723Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3220971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.3221215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.3221612Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3222015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3222263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.3222496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.3222895Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3223292Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3223538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.3223779Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.3224262Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:48:20.3224665Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.3225422Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3226158Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3226409Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.3226656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.3227042Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3227490Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3227752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.3227992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.3228396Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3228795Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:48:20.3229044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.3229283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.3229687Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3230073Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3230831Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3231083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.3231329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.3231730Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3232134Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3232892Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3233637Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3233954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.3234193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.3234599Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3235005Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3235999Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3236239Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.3236480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.3236996Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3237419Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:48:20.3237667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.3237908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.3238308Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3238722Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3238975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.3239213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.3239880Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3240313Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3241078Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3241338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.3241582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.3241990Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:48:20.3242392Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3242635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.3242875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.3243277Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3243775Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3244532Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3244784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.3245023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.3245424Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3245826Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:48:20.3246078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.3246320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.3246770Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3247184Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3247940Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3248169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.3248414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.3248814Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3249488Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3249765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.3250184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.3250770Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 
2022-11-23T02:48:20.3251185Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3251435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.3251658Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.3252067Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3252473Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3252590Z dist init r=0, world=2 2022-11-23T02:48:20.3252702Z dist init r=1, world=2 2022-11-23T02:48:20.3252805Z ok (28.749s) 2022-11-23T02:48:20.3253179Z test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88449 2022-11-23T02:48:20.3253405Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88450 2022-11-23T02:48:20.3253886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3254051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3254446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3254643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3255020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3255199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3255586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3255781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3256034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.3256266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.3256676Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.3257130Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.3257376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.3257609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.3258638Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3258762Z warnings.warn( 2022-11-23T02:48:20.3259783Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3259899Z warnings.warn( 2022-11-23T02:48:20.3260148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.3260400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.3260808Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.3261192Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.3261443Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.3261688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.3262079Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3262473Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3262719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.3263033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.3263429Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3263826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3264588Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3265343Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3265576Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.3265822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.3266268Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3266679Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3266923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.3267168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.3267564Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3267960Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3268206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.3268442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.3268839Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3269232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3269983Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3270734Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3270980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.3271225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.3271623Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3272018Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3272327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.3272732Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3272978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.3273358Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3273606Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.3274053Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:48:20.3274300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.3274709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.3275718Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3276496Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3276744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.3276984Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.3277392Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3277797Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3278031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.3278434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:48:20.3278682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.3279083Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3279328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.3279572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.3279974Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3280380Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3281130Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3281379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.3281602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.3282087Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3282486Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3283243Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3283995Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3284245Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.3284486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.3284885Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3285331Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3286087Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3286337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.3286584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.3286964Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:48:20.3287368Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3287615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.3287857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.3288260Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3288661Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3288911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.3289148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.3289548Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3289954Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3290693Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3290944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.3291267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.3291671Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3292073Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3292317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.3292557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.3292956Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3293354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3294111Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3294407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.3294642Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.3295045Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:48:20.3295439Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3295685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.3295928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.3296330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3296730Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3297481Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3297730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.3297972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.3298359Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3298757Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:48:20.3299005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.3299250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.3299649Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3300047Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3300291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.3300594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.3300991Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3301374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3301493Z dist init r=0, world=2 2022-11-23T02:48:20.3301603Z dist init r=1, world=2 2022-11-23T02:48:20.3301710Z ok (28.849s) 2022-11-23T02:48:20.3302075Z test_mixture_of_experts_with_delay_before_free_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88772 2022-11-23T02:48:20.3302305Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88773 2022-11-23T02:48:20.3302685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3302873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3303251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3303453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3303882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3304072Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3304461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3304657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3304906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.3305158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.3305564Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.3305952Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.3306191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.3306422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.3307451Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3307576Z warnings.warn( 2022-11-23T02:48:20.3308608Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3308724Z warnings.warn( 2022-11-23T02:48:20.3308978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.3309229Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.3309639Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.3309837Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.3310042Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3310191Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3310405Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3310562Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3310781Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3310889Z self.run() 2022-11-23T02:48:20.3311096Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3311249Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3311585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3311728Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3312099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3312228Z getattr(self, test_name)() 2022-11-23T02:48:20.3312598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3312749Z fn() 2022-11-23T02:48:20.3313138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3313269Z test(self, **param_kwargs) 2022-11-23T02:48:20.3313621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3313752Z return func(*args, **kwargs) 2022-11-23T02:48:20.3314038Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3314159Z self.run_subtests( 2022-11-23T02:48:20.3314523Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3314692Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3315305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3315478Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3315856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3315980Z output = model(*input) 2022-11-23T02:48:20.3316321Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3316464Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3316850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3317038Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3317419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3317545Z _lazy_init(state, module) 2022-11-23T02:48:20.3317891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3318041Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3318385Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3318514Z return func(*args, **kwargs) 2022-11-23T02:48:20.3318901Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3319007Z p_assert( 2022-11-23T02:48:20.3319352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3319573Z traceback.print_stack() 2022-11-23T02:48:20.3319967Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.3320100Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.3320321Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3320468Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3320674Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3320830Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3321049Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3321139Z self.run() 2022-11-23T02:48:20.3321349Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3321502Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3321853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3321990Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3322360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3322558Z getattr(self, test_name)() 2022-11-23T02:48:20.3322942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3323026Z fn() 2022-11-23T02:48:20.3323397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3323523Z test(self, **param_kwargs) 2022-11-23T02:48:20.3323888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3324019Z return func(*args, **kwargs) 2022-11-23T02:48:20.3324305Z File
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3324425Z self.run_subtests( 2022-11-23T02:48:20.3324788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3324946Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3325323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3325481Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3325869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3325994Z output = model(*input) 2022-11-23T02:48:20.3326330Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3326482Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3326871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3327039Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3327424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3327550Z _lazy_init(state, module) 2022-11-23T02:48:20.3327909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3328057Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3328406Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3328536Z return func(*args, **kwargs) 2022-11-23T02:48:20.3328923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3329079Z p_assert( 2022-11-23T02:48:20.3329430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3329560Z traceback.print_stack() 2022-11-23T02:48:20.3329814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.3330067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.3330474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3330880Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3331013Z File "<string>", line 1, in <module> 2022-11-23T02:48:20.3331229Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3331365Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3331574Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3331732Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3331953Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3332112Z self.run() 2022-11-23T02:48:20.3332333Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3332483Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3332819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3332959Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3333329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3333463Z getattr(self, test_name)() 2022-11-23T02:48:20.3333832Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3333934Z fn() 2022-11-23T02:48:20.3334308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3334434Z test(self, **param_kwargs) 2022-11-23T02:48:20.3334786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3334916Z return func(*args, **kwargs) 2022-11-23T02:48:20.3335201Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3335321Z self.run_subtests( 2022-11-23T02:48:20.3335688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3335860Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3336234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3336389Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3336756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3336881Z output = model(*input) 2022-11-23T02:48:20.3337220Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3337365Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3337750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3337932Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3338308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3338498Z _lazy_init(state, module) 2022-11-23T02:48:20.3338847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3338998Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3339346Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3339476Z return func(*args, **kwargs) 2022-11-23T02:48:20.3339864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3339972Z p_assert( 2022-11-23T02:48:20.3340318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3340447Z traceback.print_stack() 2022-11-23T02:48:20.3340560Z File "", line 1, in 2022-11-23T02:48:20.3340778Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3340923Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3341129Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3341284Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3341553Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3341670Z self.run() 2022-11-23T02:48:20.3341881Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3342012Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3342364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3342500Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3342868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3342999Z getattr(self, test_name)() 2022-11-23T02:48:20.3343369Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3343468Z fn() 2022-11-23T02:48:20.3343822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3343952Z test(self, **param_kwargs) 2022-11-23T02:48:20.3344316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3344444Z return func(*args, **kwargs) 2022-11-23T02:48:20.3344728Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3344843Z self.run_subtests( 2022-11-23T02:48:20.3345206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3345378Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3345736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3345895Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3346281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3346404Z output = model(*input) 2022-11-23T02:48:20.3346739Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3346881Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3347264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3347446Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3347824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3347999Z _lazy_init(state, module) 2022-11-23T02:48:20.3348364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3348512Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3348859Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3348988Z return func(*args, **kwargs) 2022-11-23T02:48:20.3349374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3349479Z p_assert( 2022-11-23T02:48:20.3349827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3349940Z traceback.print_stack() 2022-11-23T02:48:20.3350190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.3350448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.3350856Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3351307Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:48:20.3351452Z File "", line 1, in 2022-11-23T02:48:20.3351670Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3351817Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3352008Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3352164Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3352385Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3352496Z self.run() 2022-11-23T02:48:20.3352702Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3352852Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3353204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3353329Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3353704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3353831Z getattr(self, test_name)() 2022-11-23T02:48:20.3354197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3354301Z fn() 2022-11-23T02:48:20.3354672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3354802Z test(self, **param_kwargs) 2022-11-23T02:48:20.3355397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3355515Z return func(*args, **kwargs) 2022-11-23T02:48:20.3355802Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3355923Z self.run_subtests( 2022-11-23T02:48:20.3356296Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3356462Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3356832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3356990Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3357373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3357582Z output = model(*input) 2022-11-23T02:48:20.3357924Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3358070Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3358454Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3358636Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3359014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3359141Z _lazy_init(state, module) 2022-11-23T02:48:20.3359832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3359997Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3360342Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3360559Z return func(*args, **kwargs) 2022-11-23T02:48:20.3361167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3361281Z p_assert( 2022-11-23T02:48:20.3361805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3361980Z traceback.print_stack() 2022-11-23T02:48:20.3362113Z File "", line 1, in 2022-11-23T02:48:20.3362333Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3362464Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3362680Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3362835Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3363054Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3363167Z self.run() 2022-11-23T02:48:20.3363376Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3363523Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3363885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3364006Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3364379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3364507Z getattr(self, test_name)() 2022-11-23T02:48:20.3364875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3364976Z fn() 2022-11-23T02:48:20.3365349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3365480Z test(self, **param_kwargs) 2022-11-23T02:48:20.3365845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3365958Z return func(*args, **kwargs) 2022-11-23T02:48:20.3366248Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3366366Z self.run_subtests( 
2022-11-23T02:48:20.3366730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3366897Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3367270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3367428Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3367812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3367983Z output = model(*input) 2022-11-23T02:48:20.3368323Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3368467Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3368857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3369037Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3369412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3369536Z _lazy_init(state, module) 2022-11-23T02:48:20.3369897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3370027Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3370380Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3370508Z return func(*args, **kwargs) 2022-11-23T02:48:20.3370897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3371003Z p_assert( 2022-11-23T02:48:20.3371396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.3371536Z traceback.print_stack() 2022-11-23T02:48:20.3371789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.3372022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.3372434Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3372907Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3373043Z File "", line 1, in 2022-11-23T02:48:20.3373261Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3373407Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3373618Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3373772Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3374009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3374119Z self.run() 2022-11-23T02:48:20.3374326Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3374477Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3374828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3374968Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3375338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3375465Z getattr(self, test_name)() 2022-11-23T02:48:20.3375823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3375926Z fn() 2022-11-23T02:48:20.3376296Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3376423Z test(self, **param_kwargs) 2022-11-23T02:48:20.3376787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3376916Z return func(*args, **kwargs) 2022-11-23T02:48:20.3377201Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3377366Z self.run_subtests( 2022-11-23T02:48:20.3377735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3377904Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3378276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3378434Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3378822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3378943Z output = model(*input) 2022-11-23T02:48:20.3379278Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3379426Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3379795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3379981Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3380360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3380483Z _lazy_init(state, module) 2022-11-23T02:48:20.3380895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3381052Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3381402Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3381528Z return func(*args, **kwargs) 2022-11-23T02:48:20.3381904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3382010Z p_assert( 2022-11-23T02:48:20.3382364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3382495Z traceback.print_stack() 2022-11-23T02:48:20.3382627Z File "", line 1, in 2022-11-23T02:48:20.3382840Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3382983Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3383178Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3383333Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3383552Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3383657Z self.run() 2022-11-23T02:48:20.3383864Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3384013Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3384361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3384502Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3384858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3384987Z getattr(self, test_name)() 2022-11-23T02:48:20.3385356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3385459Z fn() 2022-11-23T02:48:20.3385835Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3385964Z test(self, **param_kwargs) 2022-11-23T02:48:20.3386328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3386457Z return func(*args, **kwargs) 2022-11-23T02:48:20.3386726Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3386911Z self.run_subtests( 2022-11-23T02:48:20.3387279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3387447Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3387824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3387985Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3388376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3388500Z output = model(*input) 2022-11-23T02:48:20.3388819Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3388964Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3389353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3389534Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3389909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3390034Z _lazy_init(state, module) 2022-11-23T02:48:20.3390442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3390602Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3390935Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3391064Z return func(*args, **kwargs) 2022-11-23T02:48:20.3391449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3391560Z p_assert( 2022-11-23T02:48:20.3391912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3392042Z traceback.print_stack() 2022-11-23T02:48:20.3392295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.3392553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.3392945Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3393355Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-11-23T02:48:20.3393489Z File "", line 1, in 2022-11-23T02:48:20.3393704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3393854Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3394068Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3394223Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3394444Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3394534Z self.run() 2022-11-23T02:48:20.3394745Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3394897Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3395497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3395639Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3396017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3396144Z getattr(self, test_name)() 2022-11-23T02:48:20.3396511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3396696Z fn() 2022-11-23T02:48:20.3397082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3397211Z test(self, **param_kwargs) 2022-11-23T02:48:20.3397582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3397713Z return func(*args, **kwargs) 2022-11-23T02:48:20.3397996Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3398112Z self.run_subtests( 2022-11-23T02:48:20.3398475Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3398624Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3398996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3399158Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3399543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3399669Z output = model(*input) 2022-11-23T02:48:20.3400064Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3400222Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3400614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3400777Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3401152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3401276Z _lazy_init(state, module) 2022-11-23T02:48:20.3401642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3401790Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3402135Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3402263Z return func(*args, **kwargs) 2022-11-23T02:48:20.3402652Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3402741Z p_assert( 2022-11-23T02:48:20.3403085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3403216Z traceback.print_stack() 2022-11-23T02:48:20.3403347Z File "", line 1, in 2022-11-23T02:48:20.3403563Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3403714Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3403922Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3404059Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3404278Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3404384Z self.run() 2022-11-23T02:48:20.3404597Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3404748Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3405099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3405236Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3405609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3405718Z getattr(self, test_name)() 2022-11-23T02:48:20.3406086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3406250Z fn() 2022-11-23T02:48:20.3406630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3406756Z test(self, **param_kwargs) 2022-11-23T02:48:20.3407125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3407253Z return func(*args, **kwargs) 2022-11-23T02:48:20.3407537Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3407636Z self.run_subtests( 
2022-11-23T02:48:20.3407999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3408167Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3408542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3408701Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3409085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3409208Z output = model(*input) 2022-11-23T02:48:20.3409597Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3409735Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3410124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3410306Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3410683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3410810Z _lazy_init(state, module) 2022-11-23T02:48:20.3411171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3411318Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3411663Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3411779Z return func(*args, **kwargs) 2022-11-23T02:48:20.3412169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3412276Z p_assert( 2022-11-23T02:48:20.3412621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.3412753Z traceback.print_stack() 2022-11-23T02:48:20.3413009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.3413257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.3413672Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3414060Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3414201Z File "", line 1, in 2022-11-23T02:48:20.3414415Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3414560Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3414769Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3414933Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3415134Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3415242Z self.run() 2022-11-23T02:48:20.3415451Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3415695Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3416049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3416186Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3416562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3416691Z getattr(self, test_name)() 2022-11-23T02:48:20.3417046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3417151Z fn() 2022-11-23T02:48:20.3417524Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3417654Z test(self, **param_kwargs) 2022-11-23T02:48:20.3418024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3418159Z return func(*args, **kwargs) 2022-11-23T02:48:20.3418448Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3418567Z self.run_subtests( 2022-11-23T02:48:20.3418973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3419161Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3419535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3419693Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3420079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3420203Z output = model(*input) 2022-11-23T02:48:20.3420544Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3420693Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3421062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3421249Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3421628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3421754Z _lazy_init(state, module) 2022-11-23T02:48:20.3422113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3422262Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3422608Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3422742Z return func(*args, **kwargs) 2022-11-23T02:48:20.3423116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3423226Z p_assert( 2022-11-23T02:48:20.3423575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3423705Z traceback.print_stack() 2022-11-23T02:48:20.3423841Z File "", line 1, in 2022-11-23T02:48:20.3424056Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3424201Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3424408Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3424546Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3424762Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3424869Z self.run() 2022-11-23T02:48:20.3425157Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3425307Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3425658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3425795Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3426151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3426280Z getattr(self, test_name)() 2022-11-23T02:48:20.3426648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3426750Z fn() 2022-11-23T02:48:20.3427121Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3427246Z test(self, **param_kwargs) 2022-11-23T02:48:20.3427611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3427746Z return func(*args, **kwargs) 2022-11-23T02:48:20.3428014Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3428131Z self.run_subtests( 2022-11-23T02:48:20.3428545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3428722Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3429096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3429255Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3429643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3429773Z output = model(*input) 2022-11-23T02:48:20.3430090Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3430235Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3430618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3430801Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3431178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3431301Z _lazy_init(state, module) 2022-11-23T02:48:20.3431660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3431810Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3432139Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3432275Z return func(*args, **kwargs) 2022-11-23T02:48:20.3432664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3432771Z p_assert( 2022-11-23T02:48:20.3433113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3433246Z traceback.print_stack() 2022-11-23T02:48:20.3433498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.3433744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.3434137Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3434546Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 
2022-11-23T02:48:20.3434746Z File "", line 1, in 2022-11-23T02:48:20.3434965Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3435330Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3435543Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3435702Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3435923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3436012Z self.run() 2022-11-23T02:48:20.3436144Z File "", line 1, in 2022-11-23T02:48:20.3436357Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3436508Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3436866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3437009Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3437225Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3437371Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3437726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3437854Z getattr(self, test_name)() 2022-11-23T02:48:20.3438140Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3438308Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3438676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3438778Z fn() 2022-11-23T02:48:20.3438998Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3439087Z self.run() 2022-11-23T02:48:20.3439466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in 
instantiated_test 2022-11-23T02:48:20.3439597Z test(self, **param_kwargs) 2022-11-23T02:48:20.3439813Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3439962Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3440335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3440467Z return func(*args, **kwargs) 2022-11-23T02:48:20.3440815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3440935Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3441221Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3441338Z self.run_subtests( 2022-11-23T02:48:20.3441711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3441843Z getattr(self, test_name)() 2022-11-23T02:48:20.3442205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3442371Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3442739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3442823Z fn() 2022-11-23T02:48:20.3443196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3443351Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3443719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3443848Z test(self, **param_kwargs) 2022-11-23T02:48:20.3444237Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3444435Z output = model(*input) 2022-11-23T02:48:20.3444810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3444920Z return func(*args, **kwargs) 2022-11-23T02:48:20.3445257Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3445406Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3445690Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3445806Z self.run_subtests( 2022-11-23T02:48:20.3446192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3446373Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3446736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3446886Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3447263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3447439Z _lazy_init(state, module) 2022-11-23T02:48:20.3447822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3447978Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3448337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3448484Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3448866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 
827, in _train_for_several_steps 2022-11-23T02:48:20.3448997Z output = model(*input) 2022-11-23T02:48:20.3449329Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3449460Z return func(*args, **kwargs) 2022-11-23T02:48:20.3449793Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3449941Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3450333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3450440Z p_assert( 2022-11-23T02:48:20.3450826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3451008Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3451336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3451473Z traceback.print_stack() 2022-11-23T02:48:20.3451850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3451977Z _lazy_init(state, module) 2022-11-23T02:48:20.3452340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3452491Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3452833Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3452961Z return func(*args, **kwargs) 2022-11-23T02:48:20.3453330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3453437Z p_assert( 2022-11-23T02:48:20.3453781Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", 
line 116, in p_assert 2022-11-23T02:48:20.3453978Z traceback.print_stack() 2022-11-23T02:48:20.3454231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.3454483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.3454899Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3455305Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3455424Z File "", line 1, in 2022-11-23T02:48:20.3455642Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3455791Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3455999Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3456161Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3456380Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3456489Z self.run() 2022-11-23T02:48:20.3456678Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3456830Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3457237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3457383Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3457755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3457884Z getattr(self, test_name)() 2022-11-23T02:48:20.3458256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3458357Z fn() 2022-11-23T02:48:20.3458720Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3458850Z test(self, **param_kwargs) 2022-11-23T02:48:20.3459215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3459343Z return func(*args, **kwargs) 2022-11-23T02:48:20.3459633Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3459751Z self.run_subtests( 2022-11-23T02:48:20.3460116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3460285Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3460639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3460798Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3461186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3461309Z output = model(*input) 2022-11-23T02:48:20.3461643Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3461792Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3462183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3462367Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3462727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3462855Z _lazy_init(state, module) 2022-11-23T02:48:20.3463218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3463427Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3463776Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3463905Z return func(*args, **kwargs) 2022-11-23T02:48:20.3464298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3464405Z p_assert( 2022-11-23T02:48:20.3464733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3464866Z traceback.print_stack() 2022-11-23T02:48:20.3464998Z File "", line 1, in 2022-11-23T02:48:20.3465214Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3465360Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3465565Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3465726Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3465948Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3466038Z self.run() 2022-11-23T02:48:20.3466247Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3466518Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3466880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3467017Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3467388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3467521Z getattr(self, test_name)() 2022-11-23T02:48:20.3467889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3467980Z fn() 2022-11-23T02:48:20.3468353Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3468480Z test(self, **param_kwargs) 2022-11-23T02:48:20.3468847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3468977Z return func(*args, **kwargs) 2022-11-23T02:48:20.3469267Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3469388Z self.run_subtests( 2022-11-23T02:48:20.3469733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3469904Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3470279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3470444Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3470831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3470956Z output = model(*input) 2022-11-23T02:48:20.3471293Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3471443Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3471832Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3471995Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3472375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3472501Z _lazy_init(state, module) 2022-11-23T02:48:20.3472863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3473074Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3473423Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3473551Z return func(*args, **kwargs) 2022-11-23T02:48:20.3473988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3474083Z p_assert( 2022-11-23T02:48:20.3474427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3474557Z traceback.print_stack() 2022-11-23T02:48:20.3474809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.3475265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.3475693Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.3476105Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:48:20.3476241Z File "", line 1, in 2022-11-23T02:48:20.3476522Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3476682Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3476890Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3477044Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3477265Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3477373Z self.run() 2022-11-23T02:48:20.3477583Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3477715Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3478076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3478213Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3478586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3478713Z getattr(self, test_name)() 2022-11-23T02:48:20.3479087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3479191Z fn() 2022-11-23T02:48:20.3479564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3479674Z test(self, **param_kwargs) 2022-11-23T02:48:20.3480046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3480178Z return func(*args, **kwargs) 2022-11-23T02:48:20.3480473Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3480591Z self.run_subtests( 2022-11-23T02:48:20.3480955Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3481129Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3481505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3481643Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3482030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3482155Z output = model(*input) 2022-11-23T02:48:20.3482491Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3482713Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3483099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3483281Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3483662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3483791Z _lazy_init(state, module) 2022-11-23T02:48:20.3484134Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3484283Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3484629Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3484759Z return func(*args, **kwargs) 2022-11-23T02:48:20.3485147Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3485260Z p_assert( 2022-11-23T02:48:20.3485611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3485725Z traceback.print_stack() 2022-11-23T02:48:20.3485858Z File "", line 1, in 2022-11-23T02:48:20.3486125Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3486282Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3486489Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3486641Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3486859Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3486966Z self.run() 2022-11-23T02:48:20.3487156Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3487313Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3487666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3487804Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3488175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3488306Z getattr(self, test_name)() 2022-11-23T02:48:20.3488674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3488780Z fn() 2022-11-23T02:48:20.3489137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3489268Z test(self, **param_kwargs) 2022-11-23T02:48:20.3489637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3489773Z return func(*args, **kwargs) 2022-11-23T02:48:20.3490060Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3490178Z self.run_subtests( 
2022-11-23T02:48:20.3490541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3490715Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3491072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3491230Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3491614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3491737Z output = model(*input) 2022-11-23T02:48:20.3492070Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3492298Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3492688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3492869Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3493234Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3493365Z _lazy_init(state, module) 2022-11-23T02:48:20.3493730Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3493876Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3494222Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3494350Z return func(*args, **kwargs) 2022-11-23T02:48:20.3494739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3494852Z p_assert( 2022-11-23T02:48:20.3495181Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.3495312Z traceback.print_stack() 2022-11-23T02:48:20.3495615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.3495870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.3496278Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3496688Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3496823Z File "", line 1, in 2022-11-23T02:48:20.3497041Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3497176Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3497386Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3497538Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3497761Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3497872Z self.run() 2022-11-23T02:48:20.3498081Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3498231Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3498581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3498701Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3499074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3499206Z getattr(self, test_name)() 2022-11-23T02:48:20.3499574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3499674Z fn() 2022-11-23T02:48:20.3500053Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3500182Z test(self, **param_kwargs) 2022-11-23T02:48:20.3500547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3500660Z return func(*args, **kwargs) 2022-11-23T02:48:20.3500946Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3501064Z self.run_subtests( 2022-11-23T02:48:20.3501430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3501661Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3502035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3502192Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3502580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3502690Z output = model(*input) 2022-11-23T02:48:20.3503023Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3503168Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3503555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3503735Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3504112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3504246Z _lazy_init(state, module) 2022-11-23T02:48:20.3504611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3504741Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3505160Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3505300Z return func(*args, **kwargs) 2022-11-23T02:48:20.3505690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3505795Z p_assert( 2022-11-23T02:48:20.3506144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3506275Z traceback.print_stack() 2022-11-23T02:48:20.3506389Z File "", line 1, in 2022-11-23T02:48:20.3506613Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3506761Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3506970Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3507125Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3507349Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3507459Z self.run() 2022-11-23T02:48:20.3507667Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3507798Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3508151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3508291Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3508666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3508799Z getattr(self, test_name)() 2022-11-23T02:48:20.3509170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3509274Z fn() 2022-11-23T02:48:20.3509655Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3509770Z test(self, **param_kwargs) 2022-11-23T02:48:20.3510142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3510273Z return func(*args, **kwargs) 2022-11-23T02:48:20.3510563Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3510686Z self.run_subtests( 2022-11-23T02:48:20.3511050Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3511336Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3511714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3511856Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3512244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3512370Z output = model(*input) 2022-11-23T02:48:20.3512706Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3512855Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3513242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3513424Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3513809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3513917Z _lazy_init(state, module) 2022-11-23T02:48:20.3514281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:48:20.3514428Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3514829Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3514965Z return func(*args, **kwargs) 2022-11-23T02:48:20.3515593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3515703Z p_assert( 2022-11-23T02:48:20.3516050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3516163Z traceback.print_stack() 2022-11-23T02:48:20.3516421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.3516672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.3517084Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:48:20.3517224Z File "", line 1, in 2022-11-23T02:48:20.3517439Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3517586Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3517793Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3517929Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3518147Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3518255Z self.run() 2022-11-23T02:48:20.3518466Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3518623Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3518977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3519115Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3519494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3519604Z getattr(self, test_name)() 2022-11-23T02:48:20.3519976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3520078Z fn() 2022-11-23T02:48:20.3520452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3520579Z test(self, **param_kwargs) 2022-11-23T02:48:20.3520946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3521168Z return func(*args, **kwargs) 2022-11-23T02:48:20.3521456Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3521555Z self.run_subtests( 2022-11-23T02:48:20.3521927Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3522099Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3522474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3522632Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3523021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3523148Z output = model(*input) 2022-11-23T02:48:20.3523489Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3523618Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3524005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3524247Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3524640Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3524765Z _lazy_init(state, module) 2022-11-23T02:48:20.3525126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3525271Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3525618Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3525733Z return func(*args, **kwargs) 2022-11-23T02:48:20.3526126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3526234Z p_assert( 2022-11-23T02:48:20.3526582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3526716Z traceback.print_stack() 2022-11-23T02:48:20.3527133Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3527268Z File "", line 1, in 2022-11-23T02:48:20.3527486Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3527615Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3527823Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3527979Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3528202Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3528312Z self.run() 2022-11-23T02:48:20.3528523Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3528677Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3529014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3529151Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3529523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3529651Z getattr(self, test_name)() 2022-11-23T02:48:20.3530018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3530123Z fn() 2022-11-23T02:48:20.3530497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3530685Z test(self, **param_kwargs) 2022-11-23T02:48:20.3531040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3531165Z return func(*args, **kwargs) 2022-11-23T02:48:20.3531454Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3531573Z self.run_subtests( 2022-11-23T02:48:20.3531943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3532114Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3532488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3532644Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3533014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3533139Z output = model(*input) 2022-11-23T02:48:20.3533477Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3533622Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3534054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3534247Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3534626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3534752Z _lazy_init(state, module) 2022-11-23T02:48:20.3535093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3535242Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3535593Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3535723Z return func(*args, **kwargs) 2022-11-23T02:48:20.3536115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3536222Z p_assert( 2022-11-23T02:48:20.3536571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3536703Z traceback.print_stack() 2022-11-23T02:48:20.3536938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.3537186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.3537599Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3538016Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3538154Z File "", line 1, in 2022-11-23T02:48:20.3538371Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3538518Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3538734Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3538871Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3539092Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3539198Z self.run() 2022-11-23T02:48:20.3539406Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3539558Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3539909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3540110Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3540484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3540592Z getattr(self, test_name)() 2022-11-23T02:48:20.3540966Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3541069Z fn() 2022-11-23T02:48:20.3541441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3541568Z test(self, **param_kwargs) 2022-11-23T02:48:20.3541934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3542063Z return func(*args, **kwargs) 2022-11-23T02:48:20.3542345Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3542448Z self.run_subtests( 2022-11-23T02:48:20.3542814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3542983Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3543410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3543578Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3543967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3544092Z output = model(*input) 2022-11-23T02:48:20.3544427Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3544555Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3544947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3545130Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3545505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3545629Z _lazy_init(state, module) 2022-11-23T02:48:20.3545993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3546141Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3546489Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3546600Z return func(*args, **kwargs) 2022-11-23T02:48:20.3546990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3547103Z p_assert( 2022-11-23T02:48:20.3547450Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3547580Z traceback.print_stack() 2022-11-23T02:48:20.3547712Z File "", line 1, in 2022-11-23T02:48:20.3547924Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3548073Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3548264Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3548418Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3548637Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3548743Z self.run() 2022-11-23T02:48:20.3548952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3549102Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3549449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3549638Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3550018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3550143Z getattr(self, test_name)() 2022-11-23T02:48:20.3550514Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3550618Z fn() 2022-11-23T02:48:20.3550994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3551121Z test(self, **param_kwargs) 2022-11-23T02:48:20.3551485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3551596Z return func(*args, **kwargs) 2022-11-23T02:48:20.3551880Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3552001Z self.run_subtests( 2022-11-23T02:48:20.3552365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3552532Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3552958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3553125Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3553512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3553616Z output = model(*input) 2022-11-23T02:48:20.3553951Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3554096Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3554486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3554668Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3555263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3555405Z _lazy_init(state, module) 2022-11-23T02:48:20.3555774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3555902Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3556248Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3556376Z return func(*args, **kwargs) 2022-11-23T02:48:20.3556763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3556876Z p_assert( 2022-11-23T02:48:20.3557226Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3557357Z traceback.print_stack() 2022-11-23T02:48:20.3557608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.3557840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.3558253Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3558661Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3558912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.3559154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.3559668Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3560077Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-11-23T02:48:20.3560332Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.3560576Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.3560979Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3561365Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3561612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.3561857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.3562261Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3562721Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3563505Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3563754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.3563995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.3564400Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 
2022-11-23T02:48:20.3564805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3565037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.3565278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.3565680Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3566081Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3566327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.3566573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.3566976Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3567382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3567633Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.3567855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.3568259Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3568660Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:48:20.3568971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.3569208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.3569613Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3570020Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3570267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.3570505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.3570907Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3571292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3571538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.3571778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.3572226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3572638Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 
2022-11-23T02:48:20.3572882Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.3573122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.3573522Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3573966Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3574194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:48:20.3574436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:48:20.3574837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.3575239Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.3575483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:48:20.3575722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:48:20.3576127Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.3576529Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 
2022-11-23T02:48:20.3576777Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:48:20.3576999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:48:20.3577398Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.3577799Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.3578046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:48:20.3578356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:48:20.3578758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.3579157Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.3579405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:48:20.3579647Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:48:20.3580046Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.3580428Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 
2022-11-23T02:48:20.3580683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:48:20.3580925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:48:20.3581325Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.3581770Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.3582022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:48:20.3582260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:48:20.3582659Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.3583056Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.3583287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:48:20.3583527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:48:20.3583930Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.3584330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 
2022-11-23T02:48:20.3584570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:48:20.3584806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:48:20.3585208Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.3585611Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.3585856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:48:20.3586103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:48:20.3586486Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.3586879Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.3587119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:48:20.3587359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:48:20.3587822Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.3588218Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 
2022-11-23T02:48:20.3588463Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:48:20.3588701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:48:20.3589100Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.3589477Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.3589596Z dist init r=1, world=2 2022-11-23T02:48:20.3589936Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3590271Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3590638Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3590964Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3591276Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3591585Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3591899Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.3592211Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3592521Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3592838Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3593162Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3593484Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3593599Z dist init r=0, world=2 2022-11-23T02:48:20.3593914Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3594231Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3594542Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3594854Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3595445Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.3595760Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3596073Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3596366Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3596678Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3596987Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3597303Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3597689Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3597808Z ok (29.851s) 2022-11-23T02:48:20.3610382Z test_mixture_of_experts_with_delay_before_free_offload_true_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89119 2022-11-23T02:48:20.3610874Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89120 2022-11-23T02:48:20.3611657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3611979Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3612686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3612954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3613356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3613539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3613930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3614129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3614385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.3614687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.3615285Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.3615710Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.3615953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.3616185Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.3617218Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3617511Z warnings.warn( 2022-11-23T02:48:20.3617766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.3618796Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3618914Z warnings.warn( 2022-11-23T02:48:20.3619159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.3619565Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.3619976Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.3620094Z File "", line 1, in 2022-11-23T02:48:20.3620312Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3620461Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3620734Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3620906Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3621128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3621236Z self.run() 2022-11-23T02:48:20.3621425Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3621579Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3621936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3622079Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3622453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3622581Z getattr(self, test_name)() 2022-11-23T02:48:20.3622957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3623061Z fn() 2022-11-23T02:48:20.3623419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3623546Z test(self, **param_kwargs) 2022-11-23T02:48:20.3623913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3624044Z return func(*args, **kwargs) 2022-11-23T02:48:20.3624329Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3624450Z self.run_subtests( 2022-11-23T02:48:20.3624814Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3624981Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3625343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3625502Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3625885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3626009Z output = model(*input) 2022-11-23T02:48:20.3626347Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3626493Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3626950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3627134Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3627513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3627623Z _lazy_init(state, module) 2022-11-23T02:48:20.3627986Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3628134Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3628480Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3628609Z return func(*args, **kwargs) 2022-11-23T02:48:20.3628994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3629105Z p_assert( 2022-11-23T02:48:20.3629434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3629567Z traceback.print_stack() 2022-11-23T02:48:20.3629699Z File "", line 1, in 2022-11-23T02:48:20.3629912Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3630110Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3630332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3630489Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3630708Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3630796Z self.run() 2022-11-23T02:48:20.3631001Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3631150Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3631508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3631647Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3632020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3632148Z getattr(self, test_name)() 2022-11-23T02:48:20.3632519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3632604Z fn() 2022-11-23T02:48:20.3632978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3633103Z test(self, **param_kwargs) 2022-11-23T02:48:20.3633468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3633597Z return func(*args, **kwargs) 2022-11-23T02:48:20.3633881Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3634002Z self.run_subtests( 
2022-11-23T02:48:20.3634364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3634515Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3634890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3635811Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3636278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3636404Z output = model(*input) 2022-11-23T02:48:20.3636738Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3636884Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3637405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3637567Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3637944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3638074Z _lazy_init(state, module) 2022-11-23T02:48:20.3638435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3638582Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3638925Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3639054Z return func(*args, **kwargs) 2022-11-23T02:48:20.3639444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3639538Z p_assert( 2022-11-23T02:48:20.3639883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.3640014Z traceback.print_stack() 2022-11-23T02:48:20.3640267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.3640587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.3641012Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3641148Z File "", line 1, in 2022-11-23T02:48:20.3641364Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3641492Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3641702Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3641863Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3642082Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3642188Z self.run() 2022-11-23T02:48:20.3642398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3642547Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3642885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3643022Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3643391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3643517Z getattr(self, test_name)() 2022-11-23T02:48:20.3643888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3643993Z fn() 2022-11-23T02:48:20.3644373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3644499Z test(self, **param_kwargs) 
2022-11-23T02:48:20.3644845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3644974Z return func(*args, **kwargs) 2022-11-23T02:48:20.3645262Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3645379Z self.run_subtests( 2022-11-23T02:48:20.3645739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3645906Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3646276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3646502Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3646874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3647001Z output = model(*input) 2022-11-23T02:48:20.3647331Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3647480Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3647869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3648050Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3648427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3648553Z _lazy_init(state, module) 2022-11-23T02:48:20.3648897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3649047Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3649394Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3649524Z return func(*args, **kwargs) 2022-11-23T02:48:20.3649964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3650082Z p_assert( 2022-11-23T02:48:20.3650431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3650560Z traceback.print_stack() 2022-11-23T02:48:20.3650948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3651084Z File "", line 1, in 2022-11-23T02:48:20.3651299Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3651454Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3651663Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3651817Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3652036Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3652143Z self.run() 2022-11-23T02:48:20.3652334Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3652486Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3652838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3652975Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3653345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3653473Z getattr(self, test_name)() 2022-11-23T02:48:20.3653847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3653949Z fn() 
2022-11-23T02:48:20.3654301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3654429Z test(self, **param_kwargs) 2022-11-23T02:48:20.3654800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3654928Z return func(*args, **kwargs) 2022-11-23T02:48:20.3655203Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3655321Z self.run_subtests( 2022-11-23T02:48:20.3655674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3655830Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3656255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3656403Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3656784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3656903Z output = model(*input) 2022-11-23T02:48:20.3657238Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3657384Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3657772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3657954Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3658310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3658440Z _lazy_init(state, module) 2022-11-23T02:48:20.3658801Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3658948Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3659346Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3659488Z return func(*args, **kwargs) 2022-11-23T02:48:20.3659882Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3659990Z p_assert( 2022-11-23T02:48:20.3660315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3660446Z traceback.print_stack() 2022-11-23T02:48:20.3660699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.3660954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.3661362Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:48:20.3661494Z File "", line 1, in 2022-11-23T02:48:20.3661713Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3661859Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3662049Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3662203Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3662417Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3662524Z self.run() 2022-11-23T02:48:20.3662732Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3662883Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3663241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3663378Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3663733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3663863Z getattr(self, test_name)() 2022-11-23T02:48:20.3664231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3664334Z fn() 2022-11-23T02:48:20.3664706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3664833Z test(self, **param_kwargs) 2022-11-23T02:48:20.3665194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3665305Z return func(*args, **kwargs) 2022-11-23T02:48:20.3665653Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3665772Z self.run_subtests( 2022-11-23T02:48:20.3666135Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3666305Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3666678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3666835Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3667221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3667346Z output = model(*input) 2022-11-23T02:48:20.3667661Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3667812Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3668196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3668376Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3668829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3668969Z _lazy_init(state, module) 2022-11-23T02:48:20.3669334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3669481Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3669806Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3669936Z return func(*args, **kwargs) 2022-11-23T02:48:20.3670319Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3670431Z p_assert( 2022-11-23T02:48:20.3670777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3670906Z traceback.print_stack() 2022-11-23T02:48:20.3671315Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3671446Z File "", line 1, in 2022-11-23T02:48:20.3671643Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3671783Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3671987Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3672140Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3672355Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3672469Z self.run() 2022-11-23T02:48:20.3672675Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3672867Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3673223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3673359Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3673733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3673867Z getattr(self, test_name)() 2022-11-23T02:48:20.3674269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3674373Z fn() 2022-11-23T02:48:20.3674749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3674857Z test(self, **param_kwargs) 2022-11-23T02:48:20.3675976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3676111Z return func(*args, **kwargs) 2022-11-23T02:48:20.3676396Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3676516Z self.run_subtests( 2022-11-23T02:48:20.3676893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3677063Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3677436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3677576Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3677958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3678087Z output = model(*input) 2022-11-23T02:48:20.3678423Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3678569Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3678955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3679231Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3679629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3679737Z _lazy_init(state, module) 2022-11-23T02:48:20.3680097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3680245Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3680592Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3680728Z return func(*args, **kwargs) 2022-11-23T02:48:20.3681115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3681224Z p_assert( 2022-11-23T02:48:20.3681574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3681688Z traceback.print_stack() 2022-11-23T02:48:20.3681940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.3682190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.3682601Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3682737Z File "", line 1, in 2022-11-23T02:48:20.3682958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3683105Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3683313Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3683449Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3683670Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3683776Z self.run() 2022-11-23T02:48:20.3683981Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3684129Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3684478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3684616Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3684986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3685179Z getattr(self, test_name)() 2022-11-23T02:48:20.3685554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3685658Z fn() 2022-11-23T02:48:20.3686031Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3686164Z test(self, **param_kwargs) 2022-11-23T02:48:20.3686532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3686660Z return func(*args, **kwargs) 2022-11-23T02:48:20.3686944Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3687042Z self.run_subtests( 2022-11-23T02:48:20.3687401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3687573Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3687948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3688104Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3688538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3688677Z output = model(*input) 2022-11-23T02:48:20.3689017Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3689145Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3689531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3689714Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3690091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3690222Z _lazy_init(state, module) 2022-11-23T02:48:20.3690581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.3690730Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3691081Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3691193Z return func(*args, **kwargs) 2022-11-23T02:48:20.3691583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3691689Z p_assert( 2022-11-23T02:48:20.3692033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3692166Z traceback.print_stack() 2022-11-23T02:48:20.3692575Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3692713Z File "", line 1, in 2022-11-23T02:48:20.3692933Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3693062Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3693275Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3693431Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3693648Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3693755Z self.run() 2022-11-23T02:48:20.3693962Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3694111Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3694442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3694643Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3695018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3695146Z getattr(self, test_name)() 2022-11-23T02:48:20.3695510Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3695615Z fn() 2022-11-23T02:48:20.3695987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3696112Z test(self, **param_kwargs) 2022-11-23T02:48:20.3696462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3696590Z return func(*args, **kwargs) 2022-11-23T02:48:20.3696873Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3696996Z self.run_subtests( 2022-11-23T02:48:20.3697358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3697524Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3697896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3698105Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3698485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3698615Z output = model(*input) 2022-11-23T02:48:20.3698950Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3699096Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3699481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3699670Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3700050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3700176Z _lazy_init(state, module) 2022-11-23T02:48:20.3700524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3700672Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3701022Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3701150Z return func(*args, **kwargs) 2022-11-23T02:48:20.3701538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3701645Z p_assert( 2022-11-23T02:48:20.3701991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3702125Z traceback.print_stack() 2022-11-23T02:48:20.3702358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.3702608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.3703019Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-11-23T02:48:20.3703154Z File "", line 1, in 2022-11-23T02:48:20.3703369Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3703517Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3703727Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3703884Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3704082Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3704267Z self.run() 2022-11-23T02:48:20.3704476Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3704626Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3704980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3705123Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3705495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3705623Z getattr(self, test_name)() 2022-11-23T02:48:20.3705969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3706073Z fn() 2022-11-23T02:48:20.3706443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3706574Z test(self, **param_kwargs) 2022-11-23T02:48:20.3706936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3707064Z return func(*args, **kwargs) 2022-11-23T02:48:20.3707406Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3707538Z self.run_subtests( 2022-11-23T02:48:20.3707887Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3708055Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3708425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3708583Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3708968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3709098Z output = model(*input) 2022-11-23T02:48:20.3709430Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3709577Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3709948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3710131Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3710507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3710633Z _lazy_init(state, module) 2022-11-23T02:48:20.3710991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3711140Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3711490Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3711619Z return func(*args, **kwargs) 2022-11-23T02:48:20.3711985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3712091Z p_assert( 2022-11-23T02:48:20.3712441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3712572Z traceback.print_stack() 2022-11-23T02:48:20.3712979Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.3713112Z File "", line 1, in 2022-11-23T02:48:20.3713326Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3713471Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3713661Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3713888Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3714109Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3714217Z self.run() 2022-11-23T02:48:20.3714423Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3714578Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3714930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3715605Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3716065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3716193Z getattr(self, test_name)() 2022-11-23T02:48:20.3716556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3716662Z fn() 2022-11-23T02:48:20.3717032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3717161Z test(self, **param_kwargs) 2022-11-23T02:48:20.3717526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3717727Z return func(*args, **kwargs) 2022-11-23T02:48:20.3718028Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3718138Z self.run_subtests( 2022-11-23T02:48:20.3718502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3718670Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3719045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3719204Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3719589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3719696Z output = model(*input) 2022-11-23T02:48:20.3720036Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3720185Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3720571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3720751Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3721126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3721250Z _lazy_init(state, module) 2022-11-23T02:48:20.3721615Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3721763Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3722094Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3722224Z return func(*args, **kwargs) 2022-11-23T02:48:20.3722617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3722725Z p_assert( 2022-11-23T02:48:20.3723068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3723197Z traceback.print_stack() 2022-11-23T02:48:20.3723446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.3723676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.3724174Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3724309Z File "", line 1, in 2022-11-23T02:48:20.3724527Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3724674Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3724887Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3725043Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3725262Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3725350Z self.run() 2022-11-23T02:48:20.3725557Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3725706Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3726057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3726193Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3726568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3726696Z getattr(self, test_name)() 2022-11-23T02:48:20.3727115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3727208Z fn() 2022-11-23T02:48:20.3727585Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3727711Z test(self, **param_kwargs) 2022-11-23T02:48:20.3728073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3728203Z return func(*args, **kwargs) 2022-11-23T02:48:20.3728487Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3728611Z self.run_subtests( 2022-11-23T02:48:20.3728975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3729126Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3729500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3729660Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3730044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3730167Z output = model(*input) 2022-11-23T02:48:20.3730504Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3730654Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3731044Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3731214Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3731592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3731717Z _lazy_init(state, module) 2022-11-23T02:48:20.3732081Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.3732229Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3732577Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3732707Z return func(*args, **kwargs) 2022-11-23T02:48:20.3733102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3733191Z p_assert( 2022-11-23T02:48:20.3733622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3733753Z traceback.print_stack() 2022-11-23T02:48:20.3734161Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.3734297Z File "", line 1, in 2022-11-23T02:48:20.3734517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3734665Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3734874Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3735010Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3735228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3735337Z self.run() 2022-11-23T02:48:20.3735542Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3735697Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3736048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3736186Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3736606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3736725Z getattr(self, test_name)() 2022-11-23T02:48:20.3737099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3737201Z fn() 2022-11-23T02:48:20.3737572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3737700Z test(self, **param_kwargs) 2022-11-23T02:48:20.3738068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3738202Z return func(*args, **kwargs) 2022-11-23T02:48:20.3738469Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3738587Z self.run_subtests( 2022-11-23T02:48:20.3738955Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3739126Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3739501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3739658Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3740044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3740171Z output = model(*input) 2022-11-23T02:48:20.3740506Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3740640Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3741028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3741210Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3741591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3741717Z _lazy_init(state, module) 2022-11-23T02:48:20.3742078Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3742227Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3742576Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3742688Z return func(*args, **kwargs) 2022-11-23T02:48:20.3743144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3743251Z p_assert( 2022-11-23T02:48:20.3743597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3743730Z traceback.print_stack() 2022-11-23T02:48:20.3743987Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.3744238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.3744651Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 
2022-11-23T02:48:20.3744768Z File "", line 1, in 2022-11-23T02:48:20.3744983Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3745130Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3745343Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3745499Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3745716Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3745822Z self.run() 2022-11-23T02:48:20.3746063Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3746225Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3746578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3746716Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3747086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3747212Z getattr(self, test_name)() 2022-11-23T02:48:20.3747577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3747685Z fn() 2022-11-23T02:48:20.3748042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3748171Z test(self, **param_kwargs) 2022-11-23T02:48:20.3748542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3748672Z return func(*args, **kwargs) 2022-11-23T02:48:20.3748957Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3749074Z self.run_subtests( 2022-11-23T02:48:20.3749438Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3749605Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3749964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3750124Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3750509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3750634Z output = model(*input) 2022-11-23T02:48:20.3750972Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3751120Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3751505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3751684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3752043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3752231Z _lazy_init(state, module) 2022-11-23T02:48:20.3752597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3752746Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3753094Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3753227Z return func(*args, **kwargs) 2022-11-23T02:48:20.3753613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3753719Z p_assert( 2022-11-23T02:48:20.3754042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3754173Z traceback.print_stack() 2022-11-23T02:48:20.3754579Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.3754718Z File "", line 1, in 2022-11-23T02:48:20.3754933Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3755633Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3755904Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3756060Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3756345Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3756469Z self.run() 2022-11-23T02:48:20.3756680Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3756833Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3757200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3757340Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3757710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3757845Z getattr(self, test_name)() 2022-11-23T02:48:20.3758198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3758302Z fn() 2022-11-23T02:48:20.3758682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3758811Z test(self, **param_kwargs) 2022-11-23T02:48:20.3759173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3759304Z return func(*args, **kwargs) 2022-11-23T02:48:20.3759589Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3759708Z self.run_subtests( 2022-11-23T02:48:20.3760056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3760231Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3760600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3760760Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3761148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3761275Z output = model(*input) 2022-11-23T02:48:20.3761612Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3761760Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3762126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3762311Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3762780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3762906Z _lazy_init(state, module) 2022-11-23T02:48:20.3763267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3763421Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3763765Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3763890Z return func(*args, **kwargs) 2022-11-23T02:48:20.3764264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3764370Z p_assert( 2022-11-23T02:48:20.3764717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3764847Z traceback.print_stack() 2022-11-23T02:48:20.3765103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.3765352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.3765763Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3765950Z File "", line 1, in 2022-11-23T02:48:20.3766159Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3766308Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3766517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3766672Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3766890Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3766996Z self.run() 2022-11-23T02:48:20.3767208Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3767340Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3767699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3767836Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3768215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3768344Z getattr(self, test_name)() 2022-11-23T02:48:20.3768712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3768816Z fn() 2022-11-23T02:48:20.3769189Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3769297Z test(self, **param_kwargs) 2022-11-23T02:48:20.3769661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3769795Z return func(*args, **kwargs) 2022-11-23T02:48:20.3770085Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3770204Z self.run_subtests( 2022-11-23T02:48:20.3770571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3770741Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3771113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3771252Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3771642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3771846Z output = model(*input) 2022-11-23T02:48:20.3772188Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3772334Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3772719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3772906Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3773283Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3773408Z _lazy_init(state, module) 2022-11-23T02:48:20.3773751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.3773898Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3774294Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3774431Z return func(*args, **kwargs) 2022-11-23T02:48:20.3774823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3774933Z p_assert( 2022-11-23T02:48:20.3775279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3775444Z traceback.print_stack() 2022-11-23T02:48:20.3775868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.3776005Z File "", line 1, in 2022-11-23T02:48:20.3776227Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3776375Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3776586Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3776743Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3776966Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3777056Z self.run() 2022-11-23T02:48:20.3777264Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3777413Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3777766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3777905Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3778278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3778406Z getattr(self, test_name)() 2022-11-23T02:48:20.3778772Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3778856Z fn() 2022-11-23T02:48:20.3779232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3779366Z test(self, **param_kwargs) 2022-11-23T02:48:20.3779733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3779863Z return func(*args, **kwargs) 2022-11-23T02:48:20.3780151Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3780267Z self.run_subtests( 2022-11-23T02:48:20.3780625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3780775Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3781146Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3781305Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3781755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3781881Z output = model(*input) 2022-11-23T02:48:20.3782217Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3782362Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3782749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3782915Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3783295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.3783420Z _lazy_init(state, module) 2022-11-23T02:48:20.3783777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3783929Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3784275Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3784404Z return func(*args, **kwargs) 2022-11-23T02:48:20.3784788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3784928Z p_assert( 2022-11-23T02:48:20.3785289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3785423Z traceback.print_stack() 2022-11-23T02:48:20.3785676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.3785921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.3786332Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:48:20.3786472Z File "", line 1, in 2022-11-23T02:48:20.3786687Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3786816Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3787028Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3787186Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3787405Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3787515Z self.run() 2022-11-23T02:48:20.3787722Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3787872Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3788227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3788347Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3788721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3788849Z getattr(self, test_name)() 2022-11-23T02:48:20.3789215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3789317Z fn() 2022-11-23T02:48:20.3789691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3789820Z test(self, **param_kwargs) 2022-11-23T02:48:20.3790167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3790296Z return func(*args, **kwargs) 2022-11-23T02:48:20.3790582Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3790703Z self.run_subtests( 2022-11-23T02:48:20.3791133Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3791302Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3791676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3791839Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3792230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3792338Z output = model(*input) 2022-11-23T02:48:20.3792673Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3792818Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3793201Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3793386Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3793762Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3793887Z _lazy_init(state, module) 2022-11-23T02:48:20.3794298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3794438Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3794789Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3794922Z return func(*args, **kwargs) 2022-11-23T02:48:20.3796185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3796298Z p_assert( 2022-11-23T02:48:20.3796653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3796795Z traceback.print_stack() 2022-11-23T02:48:20.3797209Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.3797327Z File "", line 1, in 2022-11-23T02:48:20.3797546Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3797697Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3797906Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3798061Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3798278Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3798384Z self.run() 2022-11-23T02:48:20.3798574Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3798724Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3799081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3799217Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3799588Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3799716Z getattr(self, test_name)() 2022-11-23T02:48:20.3800086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3800189Z fn() 2022-11-23T02:48:20.3800545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3800672Z test(self, **param_kwargs) 2022-11-23T02:48:20.3801035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3801161Z return func(*args, **kwargs) 2022-11-23T02:48:20.3801566Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3801684Z self.run_subtests( 2022-11-23T02:48:20.3802053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3802222Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3802580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3802744Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3803127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3803251Z output = model(*input) 2022-11-23T02:48:20.3803585Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3803735Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3804123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3804303Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3804733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3804872Z _lazy_init(state, module) 2022-11-23T02:48:20.3805239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3805389Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3805733Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3805863Z return func(*args, **kwargs) 2022-11-23T02:48:20.3806252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3806363Z p_assert( 2022-11-23T02:48:20.3806688Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3806817Z traceback.print_stack() 2022-11-23T02:48:20.3807069Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.3807319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.3807729Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3807862Z File "", line 1, in 2022-11-23T02:48:20.3808076Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3808221Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3808411Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3808569Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3808788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3808896Z self.run() 2022-11-23T02:48:20.3809104Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3809258Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3809613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3809754Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3810104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3810232Z getattr(self, test_name)() 2022-11-23T02:48:20.3810603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3810767Z fn() 
2022-11-23T02:48:20.3811145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3811272Z test(self, **param_kwargs) 2022-11-23T02:48:20.3811635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3811763Z return func(*args, **kwargs) 2022-11-23T02:48:20.3812032Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3812148Z self.run_subtests( 2022-11-23T02:48:20.3812511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3812676Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3813049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3813213Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3813600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3813723Z output = model(*input) 2022-11-23T02:48:20.3814041Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3814240Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3814638Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3814819Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3815197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3815322Z _lazy_init(state, module) 2022-11-23T02:48:20.3815680Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3815834Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3816159Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3816289Z return func(*args, **kwargs) 2022-11-23T02:48:20.3816680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3816787Z p_assert( 2022-11-23T02:48:20.3817133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3817264Z traceback.print_stack() 2022-11-23T02:48:20.3817677Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.3817810Z File "", line 1, in 2022-11-23T02:48:20.3818005Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3818154Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3818359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3818513Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3818730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3818841Z self.run() 2022-11-23T02:48:20.3819049Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3819182Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3819531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3819673Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3820044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:48:20.3820173Z getattr(self, test_name)() 2022-11-23T02:48:20.3820636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3820741Z fn() 2022-11-23T02:48:20.3821115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3821224Z test(self, **param_kwargs) 2022-11-23T02:48:20.3821598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3821731Z return func(*args, **kwargs) 2022-11-23T02:48:20.3822017Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3822133Z self.run_subtests( 2022-11-23T02:48:20.3822496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3822664Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3823042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3823179Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3823565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3823738Z output = model(*input) 2022-11-23T02:48:20.3824087Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3824233Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3824616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3824799Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3825173Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3825287Z _lazy_init(state, module) 2022-11-23T02:48:20.3825647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3825796Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3826149Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3826278Z return func(*args, **kwargs) 2022-11-23T02:48:20.3826670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3826778Z p_assert( 2022-11-23T02:48:20.3827125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3827237Z traceback.print_stack() 2022-11-23T02:48:20.3827487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.3827736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.3828145Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:48:20.3828280Z File "", line 1, in 2022-11-23T02:48:20.3828502Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3828650Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3828859Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3828993Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3829210Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3829317Z self.run() 2022-11-23T02:48:20.3829524Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3829737Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3830095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3830234Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3830600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3830712Z getattr(self, test_name)() 2022-11-23T02:48:20.3831082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3831183Z fn() 2022-11-23T02:48:20.3831556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3831684Z test(self, **param_kwargs) 2022-11-23T02:48:20.3832046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3832180Z return func(*args, **kwargs) 2022-11-23T02:48:20.3832465Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3832563Z self.run_subtests( 2022-11-23T02:48:20.3832927Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3833138Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3833522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3833679Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3834063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3834188Z output = model(*input) 2022-11-23T02:48:20.3834525Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3834658Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3835597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3835856Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3836247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3836374Z _lazy_init(state, module) 2022-11-23T02:48:20.3836736Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3836882Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3837231Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3837341Z return func(*args, **kwargs) 2022-11-23T02:48:20.3837729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3837841Z p_assert( 2022-11-23T02:48:20.3838184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3838317Z traceback.print_stack() 2022-11-23T02:48:20.3838732Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.3838867Z File "", line 1, in 2022-11-23T02:48:20.3839081Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3839209Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3839415Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3839569Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3839787Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3840011Z self.run() 2022-11-23T02:48:20.3840221Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3840374Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3840709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3840852Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3841225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3841353Z getattr(self, test_name)() 2022-11-23T02:48:20.3841723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3841827Z fn() 2022-11-23T02:48:20.3842203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3842335Z test(self, **param_kwargs) 2022-11-23T02:48:20.3842684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3842812Z return func(*args, **kwargs) 2022-11-23T02:48:20.3843096Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3843269Z self.run_subtests( 2022-11-23T02:48:20.3843652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3843820Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3844193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3844351Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3844741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3844851Z output = model(*input) 2022-11-23T02:48:20.3845186Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3845334Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3845723Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3845906Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3846284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3846413Z _lazy_init(state, module) 2022-11-23T02:48:20.3846775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3846903Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3847248Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3847382Z return func(*args, **kwargs) 2022-11-23T02:48:20.3847775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3847881Z p_assert( 2022-11-23T02:48:20.3848226Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3848357Z traceback.print_stack() 2022-11-23T02:48:20.3848591Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.3848834Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.3849246Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3849379Z File "", line 1, in 2022-11-23T02:48:20.3849655Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3849803Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3850014Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3850171Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3850380Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3850493Z self.run() 2022-11-23T02:48:20.3850821Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3850998Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3851452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3851592Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3851964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3852111Z getattr(self, test_name)() 2022-11-23T02:48:20.3852591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3852787Z fn() 
2022-11-23T02:48:20.3853164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3853352Z test(self, **param_kwargs) 2022-11-23T02:48:20.3853868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3854274Z return func(*args, **kwargs) 2022-11-23T02:48:20.3854566Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3854685Z self.run_subtests( 2022-11-23T02:48:20.3855037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3855210Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3855585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3855745Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3856135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3856262Z output = model(*input) 2022-11-23T02:48:20.3856597Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3856741Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3857104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3857288Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3857669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3857796Z _lazy_init(state, module) 2022-11-23T02:48:20.3858156Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3858305Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3858654Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3858784Z return func(*args, **kwargs) 2022-11-23T02:48:20.3859154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3859261Z p_assert( 2022-11-23T02:48:20.3859611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3859745Z traceback.print_stack() 2022-11-23T02:48:20.3860234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.3860371Z File "", line 1, in 2022-11-23T02:48:20.3860590Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3860737Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3860931Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3861087Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3861363Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3861472Z self.run() 2022-11-23T02:48:20.3861680Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3861829Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3862178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3862320Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3862674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:48:20.3862804Z getattr(self, test_name)() 2022-11-23T02:48:20.3863179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3863325Z fn() 2022-11-23T02:48:20.3863710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3863843Z test(self, **param_kwargs) 2022-11-23T02:48:20.3864206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3864334Z return func(*args, **kwargs) 2022-11-23T02:48:20.3864601Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3864730Z self.run_subtests( 2022-11-23T02:48:20.3865491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3865851Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3866467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3866635Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3867024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3867149Z output = model(*input) 2022-11-23T02:48:20.3867469Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3867617Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3868005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3868191Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3868574Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3868702Z _lazy_init(state, module) 2022-11-23T02:48:20.3869066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3869213Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3869544Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3869675Z return func(*args, **kwargs) 2022-11-23T02:48:20.3870062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3870170Z p_assert( 2022-11-23T02:48:20.3870601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3870733Z traceback.print_stack() 2022-11-23T02:48:20.3870988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.3871236Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.3871635Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3872049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.3872297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.3872537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.3872943Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-11-23T02:48:20.3873352Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.3873598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.3873884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.3874347Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3874732Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.3874979Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.3875771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.3876208Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3876610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.3877376Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.3877632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.3877871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.3878273Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3878684Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.3879443Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3879679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.3879920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.3880324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3880836Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.3881083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.3881324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.3881730Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 
2022-11-23T02:48:20.3882134Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.3882379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.3882620Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.3883001Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3883407Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.3883651Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.3883947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.3884366Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3884763Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.3885007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.3885249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.3885660Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.3886043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:48:20.3886295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.3886538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.3886940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3887336Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.3888096Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3888354Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.3888598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.3889000Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3889400Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.3889632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:48:20.3889872Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:48:20.3890340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 
2022-11-23T02:48:20.3890739Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.3890989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:48:20.3891389Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.3891638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:48:20.3892038Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.3892282Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:48:20.3892686Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.3892914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:48:20.3893352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.3893608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:48:20.3893848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:48:20.3894252Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.3894650Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 
2022-11-23T02:48:20.3894904Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:48:20.3895306Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.3895553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:48:20.3895931Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.3896686Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3896937Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:48:20.3897180Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:48:20.3897577Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.3897980Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.3898226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:48:20.3898624Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 
2022-11-23T02:48:20.3898869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:48:20.3899266Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.3900089Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3900323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:48:20.3900560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:48:20.3900961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.3901359Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.3901607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:48:20.3902012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.3902259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:48:20.3902704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 
2022-11-23T02:48:20.3902958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:48:20.3903364Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.3903591Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:48:20.3903988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.3904240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:48:20.3904645Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.3904897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:48:20.3905295Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.3906050Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.3906300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:48:20.3906545Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:48:20.3906946Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 
2022-11-23T02:48:20.3907326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.3907447Z dist init r=1, world=2 2022-11-23T02:48:20.3907786Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3908109Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3908422Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3908810Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3909126Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3909439Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3909748Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3910055Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3910367Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.3910656Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3911022Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3911342Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.3911462Z dist init r=0, world=2 2022-11-23T02:48:20.3911795Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3912122Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3912436Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3912753Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3913065Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3913373Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3913678Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.3913973Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3914284Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3914591Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3914897Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3915887Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.3916111Z ok (30.353s) 2022-11-23T02:48:20.3916488Z test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89466 2022-11-23T02:48:20.3916720Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89467 2022-11-23T02:48:20.3917124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3917306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3917680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3917879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3918264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.3918448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.3918837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.3919032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.3919339Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.3919608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.3920019Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.3920405Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.3920645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.3920883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.3921921Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3922043Z warnings.warn( 2022-11-23T02:48:20.3922291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:20.3923307Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.3923426Z warnings.warn( 2022-11-23T02:48:20.3923675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:20.3924076Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:20.3924476Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:48:20.3924595Z File "", line 1, in 2022-11-23T02:48:20.3924814Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3924961Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3925234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3925393Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3925616Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3925728Z self.run() 2022-11-23T02:48:20.3925940Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3926072Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3926428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3926567Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3926943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3927072Z getattr(self, test_name)() 2022-11-23T02:48:20.3927442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3927550Z fn() 2022-11-23T02:48:20.3927929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3928040Z test(self, **param_kwargs) 2022-11-23T02:48:20.3928451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3928593Z return func(*args, **kwargs) 2022-11-23T02:48:20.3928880Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3929001Z self.run_subtests( 2022-11-23T02:48:20.3929365Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3929533Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3929906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3930053Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3930440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3930564Z output = model(*input) 2022-11-23T02:48:20.3930901Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3931048Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3931434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3931619Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3931993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3932105Z _lazy_init(state, module) 2022-11-23T02:48:20.3932466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3932612Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3932959Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3933088Z return func(*args, **kwargs) 2022-11-23T02:48:20.3933482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3933593Z p_assert( 2022-11-23T02:48:20.3933941Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3934053Z traceback.print_stack() 2022-11-23T02:48:20.3934184Z File "", line 1, in 2022-11-23T02:48:20.3934399Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3934605Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3934815Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3934973Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3935190Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3935278Z self.run() 2022-11-23T02:48:20.3935491Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3935644Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3935997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3936135Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3936506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3936632Z getattr(self, test_name)() 2022-11-23T02:48:20.3937000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3937087Z fn() 2022-11-23T02:48:20.3937459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3937584Z test(self, **param_kwargs) 2022-11-23T02:48:20.3937992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3938135Z return func(*args, **kwargs) 2022-11-23T02:48:20.3938425Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3938542Z self.run_subtests( 
2022-11-23T02:48:20.3938906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3939056Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3939436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3939593Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3939981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3940107Z output = model(*input) 2022-11-23T02:48:20.3940445Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3940592Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3940979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3941144Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3941621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3941751Z _lazy_init(state, module) 2022-11-23T02:48:20.3942109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3942256Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3942602Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3942737Z return func(*args, **kwargs) 2022-11-23T02:48:20.3943125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3943215Z p_assert( 2022-11-23T02:48:20.3943562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.3943692Z traceback.print_stack() 2022-11-23T02:48:20.3943939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:48:20.3944253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:48:20.3944667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3944804Z File "", line 1, in 2022-11-23T02:48:20.3945022Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3945156Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3945366Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3945517Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3945733Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3945841Z self.run() 2022-11-23T02:48:20.3946045Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3946194Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3946548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3946667Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3947038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3947167Z getattr(self, test_name)() 2022-11-23T02:48:20.3947584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3947700Z fn() 2022-11-23T02:48:20.3948076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3948204Z test(self, **param_kwargs) 
2022-11-23T02:48:20.3948570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3948682Z return func(*args, **kwargs) 2022-11-23T02:48:20.3948970Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3949086Z self.run_subtests( 2022-11-23T02:48:20.3949449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3949621Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3949994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3950153Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3950536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3950642Z output = model(*input) 2022-11-23T02:48:20.3950974Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3951126Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3951513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3951694Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3952077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3952205Z _lazy_init(state, module) 2022-11-23T02:48:20.3952568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3952698Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3953041Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3953168Z return func(*args, **kwargs) 2022-11-23T02:48:20.3953556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3953728Z p_assert( 2022-11-23T02:48:20.3954079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3954211Z traceback.print_stack() 2022-11-23T02:48:20.3954625Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:48:20.3954742Z File "", line 1, in 2022-11-23T02:48:20.3954959Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3955685Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3955903Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3956061Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3956282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3956397Z self.run() 2022-11-23T02:48:20.3956586Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3956735Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3957093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3957232Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3957677Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3957819Z getattr(self, test_name)() 2022-11-23T02:48:20.3958194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3958297Z fn() 
2022-11-23T02:48:20.3958648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3958780Z test(self, **param_kwargs) 2022-11-23T02:48:20.3959150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3959279Z return func(*args, **kwargs) 2022-11-23T02:48:20.3959567Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3959684Z self.run_subtests( 2022-11-23T02:48:20.3960049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3960219Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3960575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3960735Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3961123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3961253Z output = model(*input) 2022-11-23T02:48:20.3961588Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3961733Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3962117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3962302Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3962678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3962786Z _lazy_init(state, module) 2022-11-23T02:48:20.3963148Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3963298Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3963643Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3963851Z return func(*args, **kwargs) 2022-11-23T02:48:20.3964247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3964357Z p_assert( 2022-11-23T02:48:20.3964680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3964815Z traceback.print_stack() 2022-11-23T02:48:20.3965069Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:48:20.3965321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:48:20.3965729Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:48:20.3965863Z File "", line 1, in 2022-11-23T02:48:20.3966083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3966236Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3966427Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3966584Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3966802Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3966976Z self.run() 2022-11-23T02:48:20.3967198Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3967349Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3967702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3967840Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3968193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3968326Z getattr(self, test_name)() 2022-11-23T02:48:20.3968693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3968796Z fn() 2022-11-23T02:48:20.3969168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3969295Z test(self, **param_kwargs) 2022-11-23T02:48:20.3969663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3969792Z return func(*args, **kwargs) 2022-11-23T02:48:20.3970057Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3970173Z self.run_subtests( 2022-11-23T02:48:20.3970535Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3970707Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3971083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3971243Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3971632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3971759Z output = model(*input) 2022-11-23T02:48:20.3972076Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3972224Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3972610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3972844Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3973227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3973440Z _lazy_init(state, module) 2022-11-23T02:48:20.3973809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3973958Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3974324Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3974457Z return func(*args, **kwargs) 2022-11-23T02:48:20.3974846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3974953Z p_assert( 2022-11-23T02:48:20.3975301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.3975432Z traceback.print_stack() 2022-11-23T02:48:20.3975840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:48:20.3975980Z File "", line 1, in 2022-11-23T02:48:20.3976179Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3976328Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3976532Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3976740Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3976970Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3977081Z self.run() 2022-11-23T02:48:20.3977289Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3977442Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3977778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3977917Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3978294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3978422Z getattr(self, test_name)() 2022-11-23T02:48:20.3978792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3978895Z fn() 2022-11-23T02:48:20.3979266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3979377Z test(self, **param_kwargs) 2022-11-23T02:48:20.3979741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3979870Z return func(*args, **kwargs) 2022-11-23T02:48:20.3980155Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3980279Z self.run_subtests( 2022-11-23T02:48:20.3980641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3980807Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3981178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3981342Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3981713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3981839Z output = model(*input) 2022-11-23T02:48:20.3982176Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3982321Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3982705Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3982952Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3983332Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3983458Z _lazy_init(state, module) 2022-11-23T02:48:20.3983804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.3983954Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3984299Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3984431Z return func(*args, **kwargs) 2022-11-23T02:48:20.3984815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.3984922Z p_assert( 2022-11-23T02:48:20.3985268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3985405Z traceback.print_stack() 2022-11-23T02:48:20.3985637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:48:20.3985887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:48:20.3986348Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3986495Z File "", line 1, in 2022-11-23T02:48:20.3986713Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3986862Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3987070Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3987225Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3987426Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3987538Z self.run() 2022-11-23T02:48:20.3987747Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3987898Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3988253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3988395Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3988771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3988882Z getattr(self, test_name)() 2022-11-23T02:48:20.3989251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3989351Z fn() 2022-11-23T02:48:20.3989723Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.3989853Z test(self, **param_kwargs) 2022-11-23T02:48:20.3990219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.3990348Z return func(*args, **kwargs) 2022-11-23T02:48:20.3990634Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.3990737Z self.run_subtests( 2022-11-23T02:48:20.3991105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.3991271Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.3991649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.3991806Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.3992188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.3992377Z output = model(*input) 2022-11-23T02:48:20.3992720Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.3992846Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.3993235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.3993418Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.3993795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.3993918Z _lazy_init(state, module) 2022-11-23T02:48:20.3994277Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.3994422Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.3994768Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.3994884Z return func(*args, **kwargs) 2022-11-23T02:48:20.3995684Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.3995793Z p_assert( 2022-11-23T02:48:20.3996225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.3996367Z traceback.print_stack() 2022-11-23T02:48:20.3996782Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:48:20.3996917Z File "", line 1, in 2022-11-23T02:48:20.3997133Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.3997262Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.3997471Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.3997632Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.3997852Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.3997957Z self.run() 2022-11-23T02:48:20.3998162Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.3998316Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.3998668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.3998788Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.3999157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.3999285Z getattr(self, test_name)() 2022-11-23T02:48:20.3999651Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.3999761Z fn() 2022-11-23T02:48:20.4000134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4000262Z test(self, **param_kwargs) 2022-11-23T02:48:20.4000628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4000742Z return func(*args, **kwargs) 2022-11-23T02:48:20.4001027Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4001145Z self.run_subtests( 2022-11-23T02:48:20.4001508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4001677Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4002049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4002289Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4002682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4002788Z output = model(*input) 2022-11-23T02:48:20.4003128Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4003275Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4003660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4003843Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4004217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.4004342Z _lazy_init(state, module) 2022-11-23T02:48:20.4004701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4004839Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4005195Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4005324Z return func(*args, **kwargs) 2022-11-23T02:48:20.4005760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4005877Z p_assert( 2022-11-23T02:48:20.4006230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4006361Z traceback.print_stack() 2022-11-23T02:48:20.4006612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:48:20.4006844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:48:20.4007259Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-11-23T02:48:20.4007397Z File "", line 1, in 2022-11-23T02:48:20.4007616Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4007763Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4007976Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4008132Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4008351Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4008441Z self.run() 2022-11-23T02:48:20.4008646Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4008797Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4009149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4009294Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4009664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4009791Z getattr(self, test_name)() 2022-11-23T02:48:20.4010143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4010249Z fn() 2022-11-23T02:48:20.4010625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4010756Z test(self, **param_kwargs) 2022-11-23T02:48:20.4011120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4011249Z return func(*args, **kwargs) 2022-11-23T02:48:20.4011532Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4011712Z self.run_subtests( 2022-11-23T02:48:20.4012061Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4012230Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4012607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4012771Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4013160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4013284Z output = model(*input) 2022-11-23T02:48:20.4013617Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4013763Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4014128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4014316Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4014693Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4014820Z _lazy_init(state, module) 2022-11-23T02:48:20.4015227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4015389Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4015739Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4015872Z return func(*args, **kwargs) 2022-11-23T02:48:20.4016261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4016350Z p_assert( 2022-11-23T02:48:20.4016701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4016832Z traceback.print_stack() 2022-11-23T02:48:20.4017243Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:48:20.4017376Z File "", line 1, in 2022-11-23T02:48:20.4017595Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4017746Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4017937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4018096Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4018313Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4018419Z self.run() 2022-11-23T02:48:20.4018628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4018783Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4019133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4019273Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4019624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4019758Z getattr(self, test_name)() 2022-11-23T02:48:20.4020127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4020233Z fn() 2022-11-23T02:48:20.4020603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4020731Z test(self, **param_kwargs) 2022-11-23T02:48:20.4021089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4021289Z return func(*args, **kwargs) 2022-11-23T02:48:20.4021556Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4021677Z self.run_subtests( 2022-11-23T02:48:20.4022041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4022214Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4022587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4022747Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4023132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4023257Z output = model(*input) 2022-11-23T02:48:20.4023573Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4023725Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4024108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4024288Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4024715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4024852Z _lazy_init(state, module) 2022-11-23T02:48:20.4025214Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4025363Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4025688Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4025818Z return func(*args, **kwargs) 2022-11-23T02:48:20.4026210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.4026317Z p_assert( 2022-11-23T02:48:20.4026663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4026795Z traceback.print_stack() 2022-11-23T02:48:20.4027050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:48:20.4027302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:48:20.4027691Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.4027826Z File "", line 1, in 2022-11-23T02:48:20.4028039Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4028189Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4028407Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4028563Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4028785Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4028893Z self.run() 2022-11-23T02:48:20.4029087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4029238Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4029587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4029725Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4030095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4030223Z getattr(self, test_name)() 2022-11-23T02:48:20.4030589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4030757Z fn() 2022-11-23T02:48:20.4031114Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4031242Z test(self, **param_kwargs) 2022-11-23T02:48:20.4031608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4031740Z return func(*args, **kwargs) 2022-11-23T02:48:20.4032024Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4032141Z self.run_subtests( 2022-11-23T02:48:20.4032504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4032673Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4033029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4033194Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4033579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4033706Z output = model(*input) 2022-11-23T02:48:20.4034089Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4034246Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4034634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4034816Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4035573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4035710Z _lazy_init(state, module) 2022-11-23T02:48:20.4036090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.4036238Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4036582Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4036711Z return func(*args, **kwargs) 2022-11-23T02:48:20.4037104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4037214Z p_assert( 2022-11-23T02:48:20.4037545Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4037678Z traceback.print_stack() 2022-11-23T02:48:20.4038083Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:48:20.4038220Z File "", line 1, in 2022-11-23T02:48:20.4038442Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4038589Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4038795Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4038933Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4039157Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4039265Z self.run() 2022-11-23T02:48:20.4039473Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4039623Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4039974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4040109Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4040480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4040692Z getattr(self, test_name)() 2022-11-23T02:48:20.4041067Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4041172Z fn() 2022-11-23T02:48:20.4041540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4041672Z test(self, **param_kwargs) 2022-11-23T02:48:20.4042033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4042164Z return func(*args, **kwargs) 2022-11-23T02:48:20.4042449Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4042548Z self.run_subtests( 2022-11-23T02:48:20.4042913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4043088Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4043465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4043628Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4044077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4044217Z output = model(*input) 2022-11-23T02:48:20.4044555Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4044681Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4045064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4045247Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4045632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.4045756Z _lazy_init(state, module) 2022-11-23T02:48:20.4046114Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4046262Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4046612Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4046723Z return func(*args, **kwargs) 2022-11-23T02:48:20.4047110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4047218Z p_assert( 2022-11-23T02:48:20.4047563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4047695Z traceback.print_stack() 2022-11-23T02:48:20.4047950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:48:20.4048201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:48:20.4048610Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 
2022-11-23T02:48:20.4048731Z File "", line 1, in 2022-11-23T02:48:20.4048952Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4049100Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4049307Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4049461Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4049680Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4049788Z self.run() 2022-11-23T02:48:20.4049995Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4050194Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4050548Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4050684Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4051062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4051190Z getattr(self, test_name)() 2022-11-23T02:48:20.4051561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4051664Z fn() 2022-11-23T02:48:20.4052038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4052147Z test(self, **param_kwargs) 2022-11-23T02:48:20.4052505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4052639Z return func(*args, **kwargs) 2022-11-23T02:48:20.4052925Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4053044Z self.run_subtests( 2022-11-23T02:48:20.4053451Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4053631Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4054006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4054145Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4054529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4054652Z output = model(*input) 2022-11-23T02:48:20.4054994Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4055139Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4055526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4055712Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4056090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4056198Z _lazy_init(state, module) 2022-11-23T02:48:20.4056557Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4056707Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4057051Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4057186Z return func(*args, **kwargs) 2022-11-23T02:48:20.4057573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4057681Z p_assert( 2022-11-23T02:48:20.4058023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4058136Z traceback.print_stack() 2022-11-23T02:48:20.4058552Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:48:20.4058688Z File "", line 1, in 2022-11-23T02:48:20.4058907Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4059054Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4059263Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4059419Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4059704Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4059793Z self.run() 2022-11-23T02:48:20.4059999Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4060149Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4060502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4060642Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4061011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4061138Z getattr(self, test_name)() 2022-11-23T02:48:20.4061490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4061593Z fn() 2022-11-23T02:48:20.4061963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4062093Z test(self, **param_kwargs) 2022-11-23T02:48:20.4062454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4062585Z return func(*args, **kwargs) 2022-11-23T02:48:20.4062921Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4063050Z self.run_subtests( 2022-11-23T02:48:20.4063397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4063568Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4063941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4064099Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4064488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4064612Z output = model(*input) 2022-11-23T02:48:20.4064943Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4065091Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4065463Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4065648Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4066025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4066150Z _lazy_init(state, module) 2022-11-23T02:48:20.4066511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4066662Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4067013Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4067144Z return func(*args, **kwargs) 2022-11-23T02:48:20.4067513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.4067621Z p_assert( 2022-11-23T02:48:20.4067963Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4068097Z traceback.print_stack() 2022-11-23T02:48:20.4068346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:48:20.4068592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:48:20.4068999Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.4069199Z File "", line 1, in 2022-11-23T02:48:20.4069398Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4069546Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4069755Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4069913Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4070133Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4070241Z self.run() 2022-11-23T02:48:20.4070448Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4070597Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4070932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4071070Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4071440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4071573Z getattr(self, test_name)() 2022-11-23T02:48:20.4071939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4072046Z fn() 2022-11-23T02:48:20.4072470Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4072612Z test(self, **param_kwargs) 2022-11-23T02:48:20.4072965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4073095Z return func(*args, **kwargs) 2022-11-23T02:48:20.4073383Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4073500Z self.run_subtests( 2022-11-23T02:48:20.4073859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4074030Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4074455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4074611Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4074984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4075492Z output = model(*input) 2022-11-23T02:48:20.4075837Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4075983Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4076366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4076550Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4076931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4077058Z _lazy_init(state, module) 2022-11-23T02:48:20.4077401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.4077554Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4077905Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4078033Z return func(*args, **kwargs) 2022-11-23T02:48:20.4078421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4078530Z p_assert( 2022-11-23T02:48:20.4078874Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4079100Z traceback.print_stack() 2022-11-23T02:48:20.4079495Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:48:20.4079630Z File "", line 1, in 2022-11-23T02:48:20.4079845Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4079998Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4080207Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4080361Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4080580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4080686Z self.run() 2022-11-23T02:48:20.4080876Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4081025Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4081373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4081515Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4081888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4082015Z getattr(self, test_name)() 2022-11-23T02:48:20.4082446Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4082543Z fn() 2022-11-23T02:48:20.4082926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4083058Z test(self, **param_kwargs) 2022-11-23T02:48:20.4083424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4083554Z return func(*args, **kwargs) 2022-11-23T02:48:20.4083841Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4083965Z self.run_subtests( 2022-11-23T02:48:20.4084323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4084472Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4084845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4085003Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4085384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4085507Z output = model(*input) 2022-11-23T02:48:20.4085842Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4085988Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4086379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4086564Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4086926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:48:20.4087056Z _lazy_init(state, module) 2022-11-23T02:48:20.4087419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4087568Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4087913Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4088045Z return func(*args, **kwargs) 2022-11-23T02:48:20.4088433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4088607Z p_assert( 2022-11-23T02:48:20.4088937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4089068Z traceback.print_stack() 2022-11-23T02:48:20.4089321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:48:20.4089570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:48:20.4089984Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:48:20.4090120Z File "", line 1, in 2022-11-23T02:48:20.4090335Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4090484Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4090675Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4090837Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4091053Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4091160Z self.run() 2022-11-23T02:48:20.4091367Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4091515Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4091920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4092049Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4092424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4092554Z getattr(self, test_name)() 2022-11-23T02:48:20.4092924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4093026Z fn() 2022-11-23T02:48:20.4093407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4093535Z test(self, **param_kwargs) 2022-11-23T02:48:20.4093898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4094009Z return func(*args, **kwargs) 2022-11-23T02:48:20.4094293Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4094411Z self.run_subtests( 2022-11-23T02:48:20.4094769Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4094937Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4095309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4095467Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4095858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4095965Z output = model(*input) 2022-11-23T02:48:20.4096299Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4096448Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4096835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4097017Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4097392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4097519Z _lazy_init(state, module) 2022-11-23T02:48:20.4097876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4098070Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4098423Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4098555Z return func(*args, **kwargs) 2022-11-23T02:48:20.4098945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4099056Z p_assert( 2022-11-23T02:48:20.4099402Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4099534Z traceback.print_stack() 2022-11-23T02:48:20.4099942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:48:20.4100057Z File "", line 1, in 2022-11-23T02:48:20.4100272Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4100423Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4100629Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4100782Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4101002Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4101111Z self.run() 2022-11-23T02:48:20.4101369Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4101513Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4101865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4102004Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4102373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4102499Z getattr(self, test_name)() 2022-11-23T02:48:20.4102872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4102973Z fn() 2022-11-23T02:48:20.4103344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4103453Z test(self, **param_kwargs) 2022-11-23T02:48:20.4103820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4103949Z return func(*args, **kwargs) 2022-11-23T02:48:20.4104237Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4104354Z self.run_subtests( 2022-11-23T02:48:20.4104717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4104884Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4105262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4105402Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4105788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4105919Z output = model(*input) 2022-11-23T02:48:20.4106251Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4106396Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4106783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4106963Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4107342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4107526Z _lazy_init(state, module) 2022-11-23T02:48:20.4107892Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4108037Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4108381Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4108516Z return func(*args, **kwargs) 2022-11-23T02:48:20.4108905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.4109011Z p_assert( 2022-11-23T02:48:20.4109362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4109475Z traceback.print_stack() 2022-11-23T02:48:20.4109724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:48:20.4109972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:48:20.4110382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.4110516Z File "", line 1, in 2022-11-23T02:48:20.4110783Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4110942Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4111152Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4111288Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4111505Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4111615Z self.run() 2022-11-23T02:48:20.4111822Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4111971Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4112329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4112469Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4112822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4112950Z getattr(self, test_name)() 2022-11-23T02:48:20.4113325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4113428Z fn() 
2022-11-23T02:48:20.4113802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4113930Z test(self, **param_kwargs) 2022-11-23T02:48:20.4114299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4114427Z return func(*args, **kwargs) 2022-11-23T02:48:20.4114697Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4114813Z self.run_subtests( 2022-11-23T02:48:20.4115635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4115820Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4116204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4116364Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4116749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4116872Z output = model(*input) 2022-11-23T02:48:20.4117189Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4117469Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4117863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4118045Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4118426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4118553Z _lazy_init(state, module) 2022-11-23T02:48:20.4118913Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4119062Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4119412Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4119524Z return func(*args, **kwargs) 2022-11-23T02:48:20.4119912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4120024Z p_assert( 2022-11-23T02:48:20.4120370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4120501Z traceback.print_stack() 2022-11-23T02:48:20.4120974Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:48:20.4121123Z File "", line 1, in 2022-11-23T02:48:20.4121321Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4121468Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4121674Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4121830Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4122047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4122159Z self.run() 2022-11-23T02:48:20.4122365Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4122514Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4122852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4122993Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4123364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:48:20.4123494Z getattr(self, test_name)() 2022-11-23T02:48:20.4123862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4123963Z fn() 2022-11-23T02:48:20.4124336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4124465Z test(self, **param_kwargs) 2022-11-23T02:48:20.4124816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4124949Z return func(*args, **kwargs) 2022-11-23T02:48:20.4125233Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4125351Z self.run_subtests( 2022-11-23T02:48:20.4125717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4125891Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4126267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4126427Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4126792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4126985Z output = model(*input) 2022-11-23T02:48:20.4127327Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4127474Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4127859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4128048Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4128427Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4128555Z _lazy_init(state, module) 2022-11-23T02:48:20.4128898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4129045Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4129391Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4129524Z return func(*args, **kwargs) 2022-11-23T02:48:20.4129912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4130020Z p_assert( 2022-11-23T02:48:20.4130428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4130573Z traceback.print_stack() 2022-11-23T02:48:20.4130809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:48:20.4131054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:48:20.4131467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:48:20.4131601Z File "", line 1, in 2022-11-23T02:48:20.4131818Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4131971Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4132180Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4132336Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4132538Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4132649Z self.run() 2022-11-23T02:48:20.4132855Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4133004Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4133354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4133493Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4133862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4133995Z getattr(self, test_name)() 2022-11-23T02:48:20.4134348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4134452Z fn() 2022-11-23T02:48:20.4134823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4134952Z test(self, **param_kwargs) 2022-11-23T02:48:20.4135317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4135445Z return func(*args, **kwargs) 2022-11-23T02:48:20.4135727Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4135825Z self.run_subtests( 2022-11-23T02:48:20.4136189Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4136419Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4136797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4136956Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4137343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4137470Z output = model(*input) 2022-11-23T02:48:20.4137804Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4137948Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4138311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4138493Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4138869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4138998Z _lazy_init(state, module) 2022-11-23T02:48:20.4139360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4139508Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4139906Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4140047Z return func(*args, **kwargs) 2022-11-23T02:48:20.4140424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4140534Z p_assert( 2022-11-23T02:48:20.4140879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4141011Z traceback.print_stack() 2022-11-23T02:48:20.4141422Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:48:20.4141564Z File "", line 1, in 2022-11-23T02:48:20.4141780Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4141908Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4142121Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4142276Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4142493Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4142599Z self.run() 2022-11-23T02:48:20.4142806Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4142960Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4143311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4143434Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4143804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4143934Z getattr(self, test_name)() 2022-11-23T02:48:20.4144298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4144399Z fn() 2022-11-23T02:48:20.4144776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4144905Z test(self, **param_kwargs) 2022-11-23T02:48:20.4145273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4145384Z return func(*args, **kwargs) 2022-11-23T02:48:20.4145669Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4145850Z self.run_subtests( 2022-11-23T02:48:20.4146218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4146387Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4146758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4146921Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4147304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4147411Z output = model(*input) 2022-11-23T02:48:20.4147743Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4147890Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4148272Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4148460Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4148835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4148961Z _lazy_init(state, module) 2022-11-23T02:48:20.4149369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4149508Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4149860Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4149989Z return func(*args, **kwargs) 2022-11-23T02:48:20.4150378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.4150485Z p_assert( 2022-11-23T02:48:20.4150830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4150967Z traceback.print_stack() 2022-11-23T02:48:20.4151220Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:48:20.4151448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:48:20.4151866Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.4152004Z File "", line 1, in 2022-11-23T02:48:20.4152223Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4152371Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4152580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4152736Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4152951Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4153043Z self.run() 2022-11-23T02:48:20.4153248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4153395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4153743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4153884Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4154255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4154386Z getattr(self, test_name)() 2022-11-23T02:48:20.4154752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4154836Z fn() 
2022-11-23T02:48:20.4155614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4155840Z test(self, **param_kwargs) 2022-11-23T02:48:20.4156217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4156348Z return func(*args, **kwargs) 2022-11-23T02:48:20.4156636Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4156757Z self.run_subtests( 2022-11-23T02:48:20.4157120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4157270Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4157641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4157797Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4158178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4158309Z output = model(*input) 2022-11-23T02:48:20.4158647Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4158795Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4159246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4159423Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4159803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4159930Z _lazy_init(state, module) 2022-11-23T02:48:20.4160293Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4160441Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4160795Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4160925Z return func(*args, **kwargs) 2022-11-23T02:48:20.4161314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4161404Z p_assert( 2022-11-23T02:48:20.4161751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4161881Z traceback.print_stack() 2022-11-23T02:48:20.4162289Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:48:20.4162423Z File "", line 1, in 2022-11-23T02:48:20.4162639Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4162787Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4162995Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4163139Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4163358Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4163466Z self.run() 2022-11-23T02:48:20.4163674Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4163829Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4164183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4164325Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4164674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:48:20.4164801Z getattr(self, test_name)() 2022-11-23T02:48:20.4165169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4165340Z fn() 2022-11-23T02:48:20.4165721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4165851Z test(self, **param_kwargs) 2022-11-23T02:48:20.4166216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4166348Z return func(*args, **kwargs) 2022-11-23T02:48:20.4166615Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:48:20.4166734Z self.run_subtests( 2022-11-23T02:48:20.4167100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4167270Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4167645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4167814Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4168203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4168328Z output = model(*input) 2022-11-23T02:48:20.4168689Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4168849Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4169240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4169424Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4169799Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4169924Z _lazy_init(state, module) 2022-11-23T02:48:20.4170292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.4170439Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4170764Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4170894Z return func(*args, **kwargs) 2022-11-23T02:48:20.4171285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4171393Z p_assert( 2022-11-23T02:48:20.4171740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4171873Z traceback.print_stack() 2022-11-23T02:48:20.4172127Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:48:20.4172370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:48:20.4172768Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.4173175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:48:20.4173428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:48:20.4173669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:48:20.4174073Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-11-23T02:48:20.4174522Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:48:20.4174767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:48:20.4175089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:48:20.4175495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.4175895Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:48:20.4176126Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:48:20.4176366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:48:20.4176767Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.4177164Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:48:20.4177931Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.4178235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:48:20.4178491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:48:20.4178895Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.4179294Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:48:20.4180055Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.4180314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:48:20.4180539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:48:20.4180947Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.4181350Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:48:20.4181596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:48:20.4181837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:48:20.4182244Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 
2022-11-23T02:48:20.4182644Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:48:20.4182890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:48:20.4183128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:48:20.4183510Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.4183913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:48:20.4184159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:48:20.4184465Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:48:20.4184871Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.4185269Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:48:20.4185515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:48:20.4185752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:48:20.4186152Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:48:20.4186554Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:48:20.4186781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:48:20.4187019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:48:20.4187419Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.4187867Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:48:20.4188639Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.4188892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:48:20.4189139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:48:20.4189539Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.4189942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:48:20.4190195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:48:20.4190421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:48:20.4190823Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 
2022-11-23T02:48:20.4191220Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:48:20.4191466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:48:20.4191711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:48:20.4192113Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.4192515Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:48:20.4192762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:48:20.4193001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:48:20.4193381Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.4193779Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:48:20.4194086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:48:20.4194329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:48:20.4194733Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:48:20.4195441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 
2022-11-23T02:48:20.4195705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:48:20.4195946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:48:20.4196354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.4196757Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:48:20.4197572Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.4197838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:48:20.4198083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:48:20.4198488Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.4198883Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:48:20.4199136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:48:20.4199377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:48:20.4199784Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 
2022-11-23T02:48:20.4200548Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.4200956Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:48:20.4201206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:48:20.4201430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:48:20.4201836Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.4202240Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:48:20.4202487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:48:20.4202728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:48:20.4203126Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:48:20.4203525Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 
2022-11-23T02:48:20.4203850Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:48:20.4204091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:48:20.4204478Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.4204875Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:48:20.4205123Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:48:20.4205360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:48:20.4205758Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.4206158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:48:20.4206965Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.4207228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:48:20.4207471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:48:20.4207876Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 
2022-11-23T02:48:20.4208275Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:48:20.4208380Z dist init r=1, world=2 2022-11-23T02:48:20.4208715Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4209044Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4209362Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4209677Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4209988Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4210304Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4210610Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4210922Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4211229Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.4211535Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4211886Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4212193Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4212311Z dist init r=0, world=2 2022-11-23T02:48:20.4212646Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4212969Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4213282Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4213596Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4213904Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4214256Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4214576Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.4214879Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4215166Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4215478Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4215787Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4216094Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4216202Z ok (30.552s) 2022-11-23T02:48:20.4216565Z test_nested_always_wrap_model_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89813 2022-11-23T02:48:20.4216791Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89814 2022-11-23T02:48:20.4217188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4217372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4217765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4217951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4218330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4218513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4218903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4219099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4219414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.4219664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.4220076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.4220487Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.4220705Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.4220940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.4221181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4221418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4222503Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4222635Z warnings.warn( 2022-11-23T02:48:20.4223663Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4223783Z warnings.warn( 2022-11-23T02:48:20.4224025Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4224263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4224503Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4224725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4224958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4225188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4225422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4225653Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4225884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4226121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4226354Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4226566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4226799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4227028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4227258Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4227489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4227719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4228012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4228245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4228455Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4228695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4228921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4229149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4229377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4229608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4229835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4230068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4230294Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4230505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4230780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4231023Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4231253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4231482Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4231710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4231937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4232173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4232384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4232611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4232841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4233070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4233298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4233528Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4233756Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4233984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4234200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4234431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4234548Z dist init r=1, world=2 2022-11-23T02:48:20.4234661Z dist init r=0, world=2 2022-11-23T02:48:20.4234770Z ok (5.213s) 2022-11-23T02:48:20.4235304Z test_nested_always_wrap_model_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89896 2022-11-23T02:48:20.4235539Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89897 2022-11-23T02:48:20.4235938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4236103Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4236603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4236803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4237177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4237360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4237741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4237936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4238186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.4238438Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.4238829Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.4239244Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.4239481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.4239777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.4240031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4240268Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4241301Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4241425Z warnings.warn( 2022-11-23T02:48:20.4242458Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4242575Z warnings.warn( 2022-11-23T02:48:20.4242810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4243030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4243271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4243509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4243739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4243974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4244205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4244436Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4244664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4244893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4245107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4245402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4245633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4245863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4246098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4246329Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4246560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4246787Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4246998Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4247227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4247460Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4247689Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4247916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4248192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4248435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4248661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4248868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4249098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4249323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4249560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4249792Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4250019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4250245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4250474Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4250701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4250913Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4251140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4251372Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4251600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4251829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4252060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4252290Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4252517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4252728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4252958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4253187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4253367Z dist init r=0, world=2 2022-11-23T02:48:20.4253483Z dist init r=1, world=2 2022-11-23T02:48:20.4253591Z ok (5.713s) 2022-11-23T02:48:20.4253966Z test_nested_always_wrap_model_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89979 2022-11-23T02:48:20.4254191Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89980 2022-11-23T02:48:20.4254567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4254750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4255138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4255334Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4255710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4255893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4256279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4256577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4256841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.4257072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.4257489Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.4257897Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.4258138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.4258372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.4258611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4258851Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4259877Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4259998Z warnings.warn( 2022-11-23T02:48:20.4261029Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4261146Z warnings.warn( 2022-11-23T02:48:20.4261367Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4261599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4261835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4262072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4262306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4262600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4262832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4263064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4263280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4263513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4263745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4263975Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4264206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4264448Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4264681Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4264911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4265169Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4265409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4265640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4265867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4266095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4266322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4266556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4266783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4266993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4267226Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4267455Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4267682Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4267910Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4268137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4268363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4268596Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4268808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4269035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4269269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4269498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4269725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4269952Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4270179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4270406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4270698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4270907Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4271132Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4271362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4271594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4271825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4271942Z dist init r=0, world=2 2022-11-23T02:48:20.4272056Z dist init r=1, world=2 2022-11-23T02:48:20.4272159Z ok (5.513s) 2022-11-23T02:48:20.4272505Z test_nested_always_wrap_model_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90062 2022-11-23T02:48:20.4272734Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90063 2022-11-23T02:48:20.4273183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4273415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4273815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4274010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4274416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4274595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4274967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4275356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4275610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.4275858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.4276273Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.4276678Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.4276912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.4277143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.4277382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4277604Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4278635Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4278755Z warnings.warn( 2022-11-23T02:48:20.4279774Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.4279980Z warnings.warn( 2022-11-23T02:48:20.4280113Z File "", line 1, in 2022-11-23T02:48:20.4280331Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4280482Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4280694Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4280850Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4281052Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4281158Z self.run() 2022-11-23T02:48:20.4281364Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4281516Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4281876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4282013Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4282385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4282516Z getattr(self, test_name)() 2022-11-23T02:48:20.4282933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4283046Z fn() 2022-11-23T02:48:20.4283421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4283549Z test(self, **param_kwargs) 2022-11-23T02:48:20.4283916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4284047Z return func(*args, **kwargs) 2022-11-23T02:48:20.4284310Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4284430Z self.run_subtests( 2022-11-23T02:48:20.4284780Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4284947Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4285329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4285488Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4285873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4285998Z output = model(*input) 2022-11-23T02:48:20.4286333Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4286480Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4286854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4287040Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4287414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4287546Z _lazy_init(state, module) 2022-11-23T02:48:20.4287906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4288053Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4288399Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4288529Z return func(*args, **kwargs) 2022-11-23T02:48:20.4288900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4289074Z p_assert( 2022-11-23T02:48:20.4289425Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4289556Z traceback.print_stack() 2022-11-23T02:48:20.4289688Z File "", line 1, in 2022-11-23T02:48:20.4289904Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4290053Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4290262Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4290396Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4290613Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4290721Z self.run() 2022-11-23T02:48:20.4290929Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4291080Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4291434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4291573Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4291925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4292056Z getattr(self, test_name)() 2022-11-23T02:48:20.4292473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4292582Z fn() 2022-11-23T02:48:20.4292960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4293087Z test(self, **param_kwargs) 2022-11-23T02:48:20.4293455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4293580Z return func(*args, **kwargs) 2022-11-23T02:48:20.4293831Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4293949Z self.run_subtests( 2022-11-23T02:48:20.4294312Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4294479Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4294857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4295016Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4295400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4295526Z output = model(*input) 2022-11-23T02:48:20.4295843Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4295989Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4296387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4296569Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4296944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4297074Z _lazy_init(state, module) 2022-11-23T02:48:20.4297434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4297581Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4297910Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4298036Z return func(*args, **kwargs) 2022-11-23T02:48:20.4298423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4298605Z p_assert( 2022-11-23T02:48:20.4298951Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4299080Z traceback.print_stack() 2022-11-23T02:48:20.4299320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4299563Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4299679Z File "", line 1, in 2022-11-23T02:48:20.4299894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4300040Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4300246Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4300402Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4300618Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4300731Z self.run() 2022-11-23T02:48:20.4300938Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4301070Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4301423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4301612Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4301996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4302124Z getattr(self, test_name)() 2022-11-23T02:48:20.4302492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4302594Z fn() 2022-11-23T02:48:20.4302965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4303080Z test(self, **param_kwargs) 2022-11-23T02:48:20.4303446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4303576Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4303836Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4303959Z self.run_subtests( 2022-11-23T02:48:20.4304323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4304491Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4304869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4305010Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4305397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4305527Z output = model(*input) 2022-11-23T02:48:20.4305863Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4306010Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4306400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4306584Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4306963Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4307071Z _lazy_init(state, module) 2022-11-23T02:48:20.4307434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4307581Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4307927Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4308117Z return func(*args, **kwargs) 2022-11-23T02:48:20.4308507Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4308617Z p_assert( 2022-11-23T02:48:20.4308966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4309080Z traceback.print_stack() 2022-11-23T02:48:20.4309213Z File "", line 1, in 2022-11-23T02:48:20.4309428Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4309572Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4309780Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4309937Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4310155Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4310249Z self.run() 2022-11-23T02:48:20.4310458Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4310606Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4310958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4311144Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4311528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4311654Z getattr(self, test_name)() 2022-11-23T02:48:20.4312021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4312104Z fn() 2022-11-23T02:48:20.4312476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4312610Z test(self, **param_kwargs) 2022-11-23T02:48:20.4312979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4313107Z return func(*args, **kwargs) 2022-11-23T02:48:20.4313366Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4313486Z self.run_subtests( 2022-11-23T02:48:20.4313848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4313999Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4314370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4314528Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4314913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4315270Z output = model(*input) 2022-11-23T02:48:20.4315622Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4315769Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4316164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4316331Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4316711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4316836Z _lazy_init(state, module) 2022-11-23T02:48:20.4317197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4317345Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4317689Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4317912Z return func(*args, **kwargs) 2022-11-23T02:48:20.4318303Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4318390Z p_assert( 2022-11-23T02:48:20.4318738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4318871Z traceback.print_stack() 2022-11-23T02:48:20.4319112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4319350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4319483Z File "", line 1, in 2022-11-23T02:48:20.4319700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4319847Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4320041Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4320197Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4320415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4320523Z self.run() 2022-11-23T02:48:20.4320793Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4320957Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4321309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4321426Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4321798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4321925Z getattr(self, test_name)() 2022-11-23T02:48:20.4322296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4322404Z fn() 2022-11-23T02:48:20.4322779Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4322908Z test(self, **param_kwargs) 2022-11-23T02:48:20.4323278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4323392Z return func(*args, **kwargs) 2022-11-23T02:48:20.4323654Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4323772Z self.run_subtests( 2022-11-23T02:48:20.4324136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4324304Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4324672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4324835Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4325224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4325329Z output = model(*input) 2022-11-23T02:48:20.4325672Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4325820Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4326207Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4326384Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4326761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4326885Z _lazy_init(state, module) 2022-11-23T02:48:20.4327309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4327440Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4327788Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4327917Z return func(*args, **kwargs) 2022-11-23T02:48:20.4328307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4328415Z p_assert( 2022-11-23T02:48:20.4328759Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4328892Z traceback.print_stack() 2022-11-23T02:48:20.4329025Z File "", line 1, in 2022-11-23T02:48:20.4329223Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4329367Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4329581Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4329737Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4329956Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4330062Z self.run() 2022-11-23T02:48:20.4330325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4330470Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4330820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4330955Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4331323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4331448Z getattr(self, test_name)() 2022-11-23T02:48:20.4331814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4331924Z fn() 2022-11-23T02:48:20.4332300Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4332409Z test(self, **param_kwargs) 2022-11-23T02:48:20.4332775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4332906Z return func(*args, **kwargs) 2022-11-23T02:48:20.4333171Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4333288Z self.run_subtests( 2022-11-23T02:48:20.4333653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4333820Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4334193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4334339Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4334727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4334850Z output = model(*input) 2022-11-23T02:48:20.4335187Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4335335Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4335722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4335902Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4336276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4336467Z _lazy_init(state, module) 2022-11-23T02:48:20.4336814Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4336959Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4337303Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4337432Z return func(*args, **kwargs) 2022-11-23T02:48:20.4337821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4337930Z p_assert( 2022-11-23T02:48:20.4338274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4338386Z traceback.print_stack() 2022-11-23T02:48:20.4338629Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4338866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4339004Z File "", line 1, in 2022-11-23T02:48:20.4339221Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4339368Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4339574Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4339778Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4339989Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4340098Z self.run() 2022-11-23T02:48:20.4340311Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4340462Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4340815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4340952Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4341329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4341456Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4341806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4341909Z fn() 2022-11-23T02:48:20.4342283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4342411Z test(self, **param_kwargs) 2022-11-23T02:48:20.4342778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4342906Z return func(*args, **kwargs) 2022-11-23T02:48:20.4343169Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4343286Z self.run_subtests( 2022-11-23T02:48:20.4343637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4343807Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4344177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4344334Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4344722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4344851Z output = model(*input) 2022-11-23T02:48:20.4345186Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4345330Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4345695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4345944Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4346321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4346445Z _lazy_init(state, module) 2022-11-23T02:48:20.4346799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4346951Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4347294Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4347424Z return func(*args, **kwargs) 2022-11-23T02:48:20.4347794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4347899Z p_assert( 2022-11-23T02:48:20.4348242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4348374Z traceback.print_stack() 2022-11-23T02:48:20.4348507Z File "", line 1, in 2022-11-23T02:48:20.4348720Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4348865Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4349054Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4349258Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4349487Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4349593Z self.run() 2022-11-23T02:48:20.4349799Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4349949Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4350295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4350433Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4350794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4350929Z getattr(self, test_name)() 
2022-11-23T02:48:20.4351297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4351401Z fn() 2022-11-23T02:48:20.4351779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4351906Z test(self, **param_kwargs) 2022-11-23T02:48:20.4352269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4352397Z return func(*args, **kwargs) 2022-11-23T02:48:20.4352641Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4352759Z self.run_subtests( 2022-11-23T02:48:20.4353130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4353301Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4353672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4353834Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4354218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4354341Z output = model(*input) 2022-11-23T02:48:20.4354659Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4354804Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4355368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4355646Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4356031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4356156Z _lazy_init(state, module) 2022-11-23T02:48:20.4356521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4356671Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4357002Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4357130Z return func(*args, **kwargs) 2022-11-23T02:48:20.4357517Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4357625Z p_assert( 2022-11-23T02:48:20.4357972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4358109Z traceback.print_stack() 2022-11-23T02:48:20.4358352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4358595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4358709Z File "", line 1, in 2022-11-23T02:48:20.4358992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4359149Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4359354Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4359508Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4359725Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4359834Z self.run() 2022-11-23T02:48:20.4360022Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4360180Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4360536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4360672Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4361038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4361169Z getattr(self, test_name)() 2022-11-23T02:48:20.4361541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4361645Z fn() 2022-11-23T02:48:20.4362004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4362130Z test(self, **param_kwargs) 2022-11-23T02:48:20.4362495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4362628Z return func(*args, **kwargs) 2022-11-23T02:48:20.4362889Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4363008Z self.run_subtests( 2022-11-23T02:48:20.4363370Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4363541Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4363896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4364057Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4364441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4364566Z output = model(*input) 2022-11-23T02:48:20.4364899Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4365123Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4365510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4365688Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4366050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4366176Z _lazy_init(state, module) 2022-11-23T02:48:20.4366537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4366685Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4367028Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4367159Z return func(*args, **kwargs) 2022-11-23T02:48:20.4367549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4367660Z p_assert( 2022-11-23T02:48:20.4367990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4368122Z traceback.print_stack() 2022-11-23T02:48:20.4368254Z File "", line 1, in 2022-11-23T02:48:20.4368518Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4368676Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4368884Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4369038Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4369253Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4369341Z self.run() 2022-11-23T02:48:20.4369548Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4369707Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4370059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4370196Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4370572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4370702Z getattr(self, test_name)() 2022-11-23T02:48:20.4371070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4371155Z fn() 2022-11-23T02:48:20.4371531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4371662Z test(self, **param_kwargs) 2022-11-23T02:48:20.4372029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4372165Z return func(*args, **kwargs) 2022-11-23T02:48:20.4372426Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4372544Z self.run_subtests( 2022-11-23T02:48:20.4372888Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4373060Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4373434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4373588Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4373969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4374093Z output = model(*input) 2022-11-23T02:48:20.4374477Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4374687Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4375079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4375245Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4375626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4375752Z _lazy_init(state, module) 2022-11-23T02:48:20.4376109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4376256Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4376601Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4376730Z return func(*args, **kwargs) 2022-11-23T02:48:20.4377123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4377213Z p_assert( 2022-11-23T02:48:20.4377562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4377693Z traceback.print_stack() 2022-11-23T02:48:20.4377986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4378236Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4378369Z File "", line 1, in 2022-11-23T02:48:20.4378584Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4378710Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4378916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4379072Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4379295Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4379403Z self.run() 2022-11-23T02:48:20.4379609Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4379759Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4380117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4380237Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4380607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4380734Z getattr(self, test_name)() 2022-11-23T02:48:20.4381101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4381203Z fn() 2022-11-23T02:48:20.4381576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4381709Z test(self, **param_kwargs) 2022-11-23T02:48:20.4382075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4382186Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4382452Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4382574Z self.run_subtests( 2022-11-23T02:48:20.4382936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4383105Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4383477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4383634Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4384019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4384193Z output = model(*input) 2022-11-23T02:48:20.4384530Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4384674Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4385063Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4385255Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4385627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4385754Z _lazy_init(state, module) 2022-11-23T02:48:20.4386118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4386248Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4386598Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4386726Z return func(*args, **kwargs) 2022-11-23T02:48:20.4387111Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4387218Z p_assert( 2022-11-23T02:48:20.4387611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4387750Z traceback.print_stack() 2022-11-23T02:48:20.4387883Z File "", line 1, in 2022-11-23T02:48:20.4388081Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4388227Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4388435Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4388590Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4388815Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4388923Z self.run() 2022-11-23T02:48:20.4389130Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4389261Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4389616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4389753Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4390119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4390245Z getattr(self, test_name)() 2022-11-23T02:48:20.4390610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4390712Z fn() 2022-11-23T02:48:20.4391084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4391200Z test(self, **param_kwargs) 2022-11-23T02:48:20.4391568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4391698Z return func(*args, **kwargs) 2022-11-23T02:48:20.4391962Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4392080Z self.run_subtests( 2022-11-23T02:48:20.4392445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4392611Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4392985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4393128Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4393582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4393705Z output = model(*input) 2022-11-23T02:48:20.4394036Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4394181Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4394568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4394752Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4395299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4395414Z _lazy_init(state, module) 2022-11-23T02:48:20.4395779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4395929Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4396275Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4396405Z return func(*args, **kwargs) 2022-11-23T02:48:20.4396791Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4396971Z p_assert( 2022-11-23T02:48:20.4397335Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4397446Z traceback.print_stack() 2022-11-23T02:48:20.4397687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4397926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4398063Z File "", line 1, in 2022-11-23T02:48:20.4398279Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4398430Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4398640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4398797Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4398995Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4399108Z self.run() 2022-11-23T02:48:20.4399315Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4399465Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4399813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4399951Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4400322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4400432Z getattr(self, test_name)() 2022-11-23T02:48:20.4400805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4400907Z fn() 2022-11-23T02:48:20.4401278Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4401405Z test(self, **param_kwargs) 2022-11-23T02:48:20.4401772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4401902Z return func(*args, **kwargs) 2022-11-23T02:48:20.4402165Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4402263Z self.run_subtests( 2022-11-23T02:48:20.4402625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4402793Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4403299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4403458Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4403842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4403970Z output = model(*input) 2022-11-23T02:48:20.4404305Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4404433Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4404818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4405002Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4405378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4405508Z _lazy_init(state, module) 2022-11-23T02:48:20.4405868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4406014Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4406407Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4406547Z return func(*args, **kwargs) 2022-11-23T02:48:20.4406920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4407025Z p_assert( 2022-11-23T02:48:20.4407370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4407500Z traceback.print_stack() 2022-11-23T02:48:20.4407634Z File "", line 1, in 2022-11-23T02:48:20.4407850Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4408003Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4408196Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4408354Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4408572Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4408684Z self.run() 2022-11-23T02:48:20.4408894Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4409044Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4409394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4409531Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4409886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4410021Z getattr(self, test_name)() 2022-11-23T02:48:20.4410386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4410489Z fn() 2022-11-23T02:48:20.4410861Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4410987Z test(self, **param_kwargs) 2022-11-23T02:48:20.4411361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4411493Z return func(*args, **kwargs) 2022-11-23T02:48:20.4411735Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4411853Z self.run_subtests( 2022-11-23T02:48:20.4412214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4412380Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4412822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4412978Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4413365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4413492Z output = model(*input) 2022-11-23T02:48:20.4413812Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4413956Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4414338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4414518Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4414894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4415021Z _lazy_init(state, module) 2022-11-23T02:48:20.4415383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4415531Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4415904Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4416043Z return func(*args, **kwargs) 2022-11-23T02:48:20.4416433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4416537Z p_assert( 2022-11-23T02:48:20.4416877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4417007Z traceback.print_stack() 2022-11-23T02:48:20.4417249Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4417499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4417616Z File "", line 1, in 2022-11-23T02:48:20.4417833Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4417981Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4418189Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4418348Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4418567Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4418673Z self.run() 2022-11-23T02:48:20.4418862Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4419008Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4419356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4419494Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4419865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4419994Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4420360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4420468Z fn() 2022-11-23T02:48:20.4420824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4420947Z test(self, **param_kwargs) 2022-11-23T02:48:20.4421312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4421441Z return func(*args, **kwargs) 2022-11-23T02:48:20.4421702Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4421881Z self.run_subtests( 2022-11-23T02:48:20.4422244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4422410Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4422765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4422922Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4423307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4423431Z output = model(*input) 2022-11-23T02:48:20.4423770Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4423921Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4424300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4424487Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4424843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4424969Z _lazy_init(state, module) 2022-11-23T02:48:20.4425381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4425535Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4425884Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4426012Z return func(*args, **kwargs) 2022-11-23T02:48:20.4426399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4426506Z p_assert( 2022-11-23T02:48:20.4426833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4426969Z traceback.print_stack() 2022-11-23T02:48:20.4427102Z File "", line 1, in 2022-11-23T02:48:20.4427315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4427460Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4427671Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4427826Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4428042Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4428131Z self.run() 2022-11-23T02:48:20.4428337Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4428487Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4428838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4428977Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4429350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4429475Z getattr(self, test_name)() 
2022-11-23T02:48:20.4429823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4429927Z fn() 2022-11-23T02:48:20.4430300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4430430Z test(self, **param_kwargs) 2022-11-23T02:48:20.4430797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4430926Z return func(*args, **kwargs) 2022-11-23T02:48:20.4431188Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4431379Z self.run_subtests( 2022-11-23T02:48:20.4431726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4431893Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4432270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4432433Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4432819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4432942Z output = model(*input) 2022-11-23T02:48:20.4433273Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4433420Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4433788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4433976Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4434355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4434481Z _lazy_init(state, module) 2022-11-23T02:48:20.4434891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4435250Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4435614Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4435743Z return func(*args, **kwargs) 2022-11-23T02:48:20.4436110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4436217Z p_assert( 2022-11-23T02:48:20.4436570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4436701Z traceback.print_stack() 2022-11-23T02:48:20.4436945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4437184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4437323Z File "", line 1, in 2022-11-23T02:48:20.4437543Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4437671Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4437877Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4438031Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4438249Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4438356Z self.run() 2022-11-23T02:48:20.4438568Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4438721Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4439071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4439190Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4439563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4439692Z getattr(self, test_name)() 2022-11-23T02:48:20.4440059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4448562Z fn() 2022-11-23T02:48:20.4449054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4449188Z test(self, **param_kwargs) 2022-11-23T02:48:20.4449569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4449838Z return func(*args, **kwargs) 2022-11-23T02:48:20.4450109Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4450228Z self.run_subtests( 2022-11-23T02:48:20.4450608Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4450778Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4451155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4451314Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4451702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4451809Z output = model(*input) 2022-11-23T02:48:20.4452152Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4452300Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4452692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4452948Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4453340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4453468Z _lazy_init(state, module) 2022-11-23T02:48:20.4453827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4453956Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4454300Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4454435Z return func(*args, **kwargs) 2022-11-23T02:48:20.4454825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4454932Z p_assert( 2022-11-23T02:48:20.4455278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4455411Z traceback.print_stack() 2022-11-23T02:48:20.4455548Z File "", line 1, in 2022-11-23T02:48:20.4455747Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4455893Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4456100Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4456254Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4456472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4456579Z self.run() 2022-11-23T02:48:20.4456789Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4456940Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4457274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4457413Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4457789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4457919Z getattr(self, test_name)() 2022-11-23T02:48:20.4458288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4458390Z fn() 2022-11-23T02:48:20.4458763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4458871Z test(self, **param_kwargs) 2022-11-23T02:48:20.4459233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4459439Z return func(*args, **kwargs) 2022-11-23T02:48:20.4459703Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4459820Z self.run_subtests( 2022-11-23T02:48:20.4460191Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4460363Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4460733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4460889Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4461253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4461378Z output = model(*input) 2022-11-23T02:48:20.4461718Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4461863Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4462250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4462488Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4462878Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4463004Z _lazy_init(state, module) 2022-11-23T02:48:20.4463342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4463491Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4463836Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4463970Z return func(*args, **kwargs) 2022-11-23T02:48:20.4464358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4464466Z p_assert( 2022-11-23T02:48:20.4464813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4464947Z traceback.print_stack() 2022-11-23T02:48:20.4465172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4465416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4465550Z File "", line 1, in 2022-11-23T02:48:20.4465767Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4465911Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4466117Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4466275Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4466476Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4466584Z self.run() 2022-11-23T02:48:20.4466792Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4466944Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4467301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4467439Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4467813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4467937Z getattr(self, test_name)() 2022-11-23T02:48:20.4468285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4468388Z fn() 2022-11-23T02:48:20.4468834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4468963Z test(self, **param_kwargs) 2022-11-23T02:48:20.4469331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4469462Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4469730Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4469848Z self.run_subtests( 2022-11-23T02:48:20.4470192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4470361Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4470732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4470897Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4471282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4471406Z output = model(*input) 2022-11-23T02:48:20.4471741Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4471941Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4472321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4472499Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4472871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4472995Z _lazy_init(state, module) 2022-11-23T02:48:20.4473353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4473506Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4473859Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4473990Z return func(*args, **kwargs) 2022-11-23T02:48:20.4474414Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4474522Z p_assert( 2022-11-23T02:48:20.4474870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4475000Z traceback.print_stack() 2022-11-23T02:48:20.4475422Z File "", line 1, in 2022-11-23T02:48:20.4475645Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4475791Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4476000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4476145Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4476364Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4476475Z self.run() 2022-11-23T02:48:20.4476681Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4476835Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4477188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4477325Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4477676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4477805Z getattr(self, test_name)() 2022-11-23T02:48:20.4478173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4478377Z fn() 2022-11-23T02:48:20.4478757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4478882Z test(self, **param_kwargs) 2022-11-23T02:48:20.4479243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4479377Z return func(*args, **kwargs) 2022-11-23T02:48:20.4479621Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4479738Z self.run_subtests( 2022-11-23T02:48:20.4480102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4480270Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4480639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4480804Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4481194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4481317Z output = model(*input) 2022-11-23T02:48:20.4481634Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4481845Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4482245Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4482428Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4482801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4482930Z _lazy_init(state, module) 2022-11-23T02:48:20.4483289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4483441Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4483771Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4483903Z return func(*args, **kwargs) 2022-11-23T02:48:20.4484296Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4484405Z p_assert( 2022-11-23T02:48:20.4484751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4484884Z traceback.print_stack() 2022-11-23T02:48:20.4485127Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4485373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4485488Z File "", line 1, in 2022-11-23T02:48:20.4485709Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4485853Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4486059Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4486214Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4486438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4486551Z self.run() 2022-11-23T02:48:20.4486759Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4486892Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4487247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4487385Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4487753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4487949Z getattr(self, test_name)() 2022-11-23T02:48:20.4488322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4488424Z fn() 2022-11-23T02:48:20.4488777Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4488912Z test(self, **param_kwargs) 2022-11-23T02:48:20.4489278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4489407Z return func(*args, **kwargs) 2022-11-23T02:48:20.4489670Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4489786Z self.run_subtests( 2022-11-23T02:48:20.4490149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4490322Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4490676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4490835Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4491273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4491406Z output = model(*input) 2022-11-23T02:48:20.4491744Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4491888Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4492275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4492456Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4492840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4492945Z _lazy_init(state, module) 2022-11-23T02:48:20.4493305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4493454Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4493800Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4493930Z return func(*args, **kwargs) 2022-11-23T02:48:20.4494318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4494424Z p_assert( 2022-11-23T02:48:20.4494747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4494877Z traceback.print_stack() 2022-11-23T02:48:20.4495008Z File "", line 1, in 2022-11-23T02:48:20.4495229Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4495376Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4495585Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4495741Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4495961Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4496051Z self.run() 2022-11-23T02:48:20.4496257Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4496407Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4496756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4496894Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4497262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4497454Z getattr(self, test_name)() 2022-11-23T02:48:20.4497826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4497909Z fn() 2022-11-23T02:48:20.4498285Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4498410Z test(self, **param_kwargs) 2022-11-23T02:48:20.4498775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4498906Z return func(*args, **kwargs) 2022-11-23T02:48:20.4499169Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4499287Z self.run_subtests( 2022-11-23T02:48:20.4499654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4499808Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4500185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4500343Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4500779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4500911Z output = model(*input) 2022-11-23T02:48:20.4501247Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4501393Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4501777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4501939Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4502318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4502444Z _lazy_init(state, module) 2022-11-23T02:48:20.4502807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4502954Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4503304Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4503436Z return func(*args, **kwargs) 2022-11-23T02:48:20.4503826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4503916Z p_assert( 2022-11-23T02:48:20.4504259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4504390Z traceback.print_stack() 2022-11-23T02:48:20.4504641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4504884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4505017Z File "", line 1, in 2022-11-23T02:48:20.4505232Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4505363Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4505571Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4505729Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4505947Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4506055Z self.run() 2022-11-23T02:48:20.4506262Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4506417Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4506855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4506974Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4507342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4507473Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4507841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4507946Z fn() 2022-11-23T02:48:20.4508319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4508449Z test(self, **param_kwargs) 2022-11-23T02:48:20.4508815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4508926Z return func(*args, **kwargs) 2022-11-23T02:48:20.4509190Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4509313Z self.run_subtests( 2022-11-23T02:48:20.4509675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4509844Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4510265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4510433Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4510818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4510922Z output = model(*input) 2022-11-23T02:48:20.4511258Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4511404Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4511795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4511981Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4512358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4512490Z _lazy_init(state, module) 2022-11-23T02:48:20.4512852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4512981Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4513328Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4513459Z return func(*args, **kwargs) 2022-11-23T02:48:20.4513850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4513962Z p_assert( 2022-11-23T02:48:20.4514308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4514439Z traceback.print_stack() 2022-11-23T02:48:20.4514573Z File "", line 1, in 2022-11-23T02:48:20.4514771Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4514926Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4515326Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4515490Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4515708Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4515815Z self.run() 2022-11-23T02:48:20.4516026Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4516159Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4516616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4516755Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4517123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4517252Z getattr(self, test_name)() 
2022-11-23T02:48:20.4517622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4517725Z fn() 2022-11-23T02:48:20.4518095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4518204Z test(self, **param_kwargs) 2022-11-23T02:48:20.4518569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4518699Z return func(*args, **kwargs) 2022-11-23T02:48:20.4518966Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4519084Z self.run_subtests( 2022-11-23T02:48:20.4519447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4519615Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4520048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4520199Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4520584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4520710Z output = model(*input) 2022-11-23T02:48:20.4521044Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4521188Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4521582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4521765Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4522143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4522256Z _lazy_init(state, module) 2022-11-23T02:48:20.4522616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4522765Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4523111Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4523240Z return func(*args, **kwargs) 2022-11-23T02:48:20.4523626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4523739Z p_assert( 2022-11-23T02:48:20.4524088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4524199Z traceback.print_stack() 2022-11-23T02:48:20.4524443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4524690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4524928Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4525164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4525395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4525630Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4525865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4526147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4526380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4526612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4526848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4527082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4527314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4527546Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4527778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4528009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4528224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4528457Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4528687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4528967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4529205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4529437Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4529668Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4529899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4530117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4530351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4530581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4530815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4531047Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4531278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4531507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4531736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4531963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4532180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4532411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4532640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4532874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4533105Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4533336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4533565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4533794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4534004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4534300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4534529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4534758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4534992Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4535220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4535450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4535564Z dist init r=0, world=2 2022-11-23T02:48:20.4535885Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4536217Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4536534Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4536892Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4537211Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4537519Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4537826Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.4538139Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4538448Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4538754Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4539062Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4539368Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4539469Z dist init r=1, world=2 2022-11-23T02:48:20.4539802Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4540126Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4540440Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4540751Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4541059Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.4541425Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4541732Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4542043Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4542350Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4542656Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4542963Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4543254Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4543360Z ok (5.713s) 2022-11-23T02:48:20.4543764Z test_nested_always_wrap_model_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90145 2022-11-23T02:48:20.4543998Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90146 2022-11-23T02:48:20.4544389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4544571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4544965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4545169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4545548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4545710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4546101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4546299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4546549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.4546800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.4547209Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.4547624Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.4547861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.4548076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.4548324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4548563Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4549597Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4549782Z warnings.warn( 2022-11-23T02:48:20.4550814Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.4550931Z warnings.warn( 2022-11-23T02:48:20.4551069Z File "", line 1, in 2022-11-23T02:48:20.4551287Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4551433Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4551641Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4551783Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4552000Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4552107Z self.run() 2022-11-23T02:48:20.4552316Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4552571Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4552940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4553080Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4553436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4553564Z getattr(self, test_name)() 2022-11-23T02:48:20.4553935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4554044Z fn() 2022-11-23T02:48:20.4554423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4554554Z test(self, **param_kwargs) 2022-11-23T02:48:20.4554920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4555233Z return func(*args, **kwargs) 2022-11-23T02:48:20.4555493Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4555610Z self.run_subtests( 2022-11-23T02:48:20.4555981Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4556149Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4556521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4556686Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4557073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4557200Z output = model(*input) 2022-11-23T02:48:20.4557514Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4557667Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4558052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4558235Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4558613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4558741Z _lazy_init(state, module) 2022-11-23T02:48:20.4559099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4559340Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4559691Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4559802Z return func(*args, **kwargs) 2022-11-23T02:48:20.4560195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4560303Z p_assert( 2022-11-23T02:48:20.4560652Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4560782Z traceback.print_stack() 2022-11-23T02:48:20.4560916Z File "", line 1, in 2022-11-23T02:48:20.4561131Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4561259Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4561473Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4561627Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4561844Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4561951Z self.run() 2022-11-23T02:48:20.4562157Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4562371Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4562734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4562852Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4563218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4563352Z getattr(self, test_name)() 2022-11-23T02:48:20.4563719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4563828Z fn() 2022-11-23T02:48:20.4564201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4564327Z test(self, **param_kwargs) 2022-11-23T02:48:20.4564692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4564805Z return func(*args, **kwargs) 2022-11-23T02:48:20.4565072Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4565189Z self.run_subtests( 2022-11-23T02:48:20.4565552Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4565719Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4566093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4566258Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4566643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4566748Z output = model(*input) 2022-11-23T02:48:20.4567088Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4567233Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4567621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4567805Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4568184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4568312Z _lazy_init(state, module) 2022-11-23T02:48:20.4568671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4568881Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4569231Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4569362Z return func(*args, **kwargs) 2022-11-23T02:48:20.4569751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4569859Z p_assert( 2022-11-23T02:48:20.4570205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4570335Z traceback.print_stack() 2022-11-23T02:48:20.4570579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4570805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4570943Z File "", line 1, in 2022-11-23T02:48:20.4571160Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4571306Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4571512Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4571668Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4571939Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4572038Z self.run() 2022-11-23T02:48:20.4572246Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4572396Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4572744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4572936Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4573310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4573444Z getattr(self, test_name)() 2022-11-23T02:48:20.4573815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4573899Z fn() 2022-11-23T02:48:20.4574312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4574439Z test(self, **param_kwargs) 2022-11-23T02:48:20.4574806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4574935Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4575196Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4575313Z self.run_subtests( 2022-11-23T02:48:20.4575676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4575831Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4576206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4576366Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4576752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4576878Z output = model(*input) 2022-11-23T02:48:20.4577215Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4577360Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4577746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4577909Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4578356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4578480Z _lazy_init(state, module) 2022-11-23T02:48:20.4578839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4578988Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4579338Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4579470Z return func(*args, **kwargs) 2022-11-23T02:48:20.4579859Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4579948Z p_assert( 2022-11-23T02:48:20.4580295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4580425Z traceback.print_stack() 2022-11-23T02:48:20.4580565Z File "", line 1, in 2022-11-23T02:48:20.4580782Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4580931Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4581143Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4581299Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4581548Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4581666Z self.run() 2022-11-23T02:48:20.4581874Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4582025Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4582372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4582512Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4582883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4582996Z getattr(self, test_name)() 2022-11-23T02:48:20.4583367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4583471Z fn() 2022-11-23T02:48:20.4583849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4583976Z test(self, **param_kwargs) 2022-11-23T02:48:20.4584342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4584472Z return func(*args, **kwargs) 2022-11-23T02:48:20.4584735Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4584835Z self.run_subtests( 2022-11-23T02:48:20.4585198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4585372Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4585742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4585900Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4586288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4586414Z output = model(*input) 2022-11-23T02:48:20.4586749Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4586876Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4587261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4587442Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4587889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4588015Z _lazy_init(state, module) 2022-11-23T02:48:20.4588375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4588528Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4588878Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4588988Z return func(*args, **kwargs) 2022-11-23T02:48:20.4589377Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4589485Z p_assert( 2022-11-23T02:48:20.4589830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4589963Z traceback.print_stack() 2022-11-23T02:48:20.4590210Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4590452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4590587Z File "", line 1, in 2022-11-23T02:48:20.4590786Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4590981Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4591199Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4591356Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4591574Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4591683Z self.run() 2022-11-23T02:48:20.4591889Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4592040Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4592381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4592516Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4592884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4593014Z getattr(self, test_name)() 2022-11-23T02:48:20.4593386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4593491Z fn() 2022-11-23T02:48:20.4593863Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4593969Z test(self, **param_kwargs) 2022-11-23T02:48:20.4594336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4594468Z return func(*args, **kwargs) 2022-11-23T02:48:20.4594735Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4594850Z self.run_subtests( 2022-11-23T02:48:20.4595392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4595563Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4595945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4596102Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4596474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4596597Z output = model(*input) 2022-11-23T02:48:20.4596933Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4597163Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4597549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4597729Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4598108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4598234Z _lazy_init(state, module) 2022-11-23T02:48:20.4598574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4598722Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4599066Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4599192Z return func(*args, **kwargs) 2022-11-23T02:48:20.4599580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4599694Z p_assert( 2022-11-23T02:48:20.4600038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4600165Z traceback.print_stack() 2022-11-23T02:48:20.4600280Z File "", line 1, in 2022-11-23T02:48:20.4600558Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4600713Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4600918Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4601070Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4601287Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4601393Z self.run() 2022-11-23T02:48:20.4601580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4601728Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4602081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4602218Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4602586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4602712Z getattr(self, test_name)() 2022-11-23T02:48:20.4603079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4603178Z fn() 2022-11-23T02:48:20.4603531Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4603657Z test(self, **param_kwargs) 2022-11-23T02:48:20.4604020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4604149Z return func(*args, **kwargs) 2022-11-23T02:48:20.4604413Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4604532Z self.run_subtests( 2022-11-23T02:48:20.4604893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4605062Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4605421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4605579Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4605965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4606089Z output = model(*input) 2022-11-23T02:48:20.4606423Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4606632Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4607019Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4607198Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4607562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4607688Z _lazy_init(state, module) 2022-11-23T02:48:20.4608047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4608195Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4608540Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4608671Z return func(*args, **kwargs) 2022-11-23T02:48:20.4609062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4609173Z p_assert( 2022-11-23T02:48:20.4609502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4609634Z traceback.print_stack() 2022-11-23T02:48:20.4609878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4610168Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4610309Z File "", line 1, in 2022-11-23T02:48:20.4610525Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4610672Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4610879Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4611015Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4611234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4611347Z self.run() 2022-11-23T02:48:20.4611552Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4611702Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4612054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4612194Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4612547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4612676Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4613043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4613145Z fn() 2022-11-23T02:48:20.4613514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4613645Z test(self, **param_kwargs) 2022-11-23T02:48:20.4614008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4614136Z return func(*args, **kwargs) 2022-11-23T02:48:20.4614377Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4614497Z self.run_subtests( 2022-11-23T02:48:20.4614860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4615027Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4615402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4615557Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4615942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4616129Z output = model(*input) 2022-11-23T02:48:20.4616450Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4616593Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4616980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4617161Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4617535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4617659Z _lazy_init(state, module) 2022-11-23T02:48:20.4618015Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4618161Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4618489Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4618624Z return func(*args, **kwargs) 2022-11-23T02:48:20.4619011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4619114Z p_assert( 2022-11-23T02:48:20.4619505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4619642Z traceback.print_stack() 2022-11-23T02:48:20.4619774Z File "", line 1, in 2022-11-23T02:48:20.4619991Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4620119Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4620325Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4620480Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4620695Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4620803Z self.run() 2022-11-23T02:48:20.4621011Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4621159Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4621498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4621632Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4622004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4622128Z getattr(self, test_name)() 
2022-11-23T02:48:20.4622493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4622593Z fn() 2022-11-23T02:48:20.4622964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4623096Z test(self, **param_kwargs) 2022-11-23T02:48:20.4623446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4623572Z return func(*args, **kwargs) 2022-11-23T02:48:20.4623841Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4623959Z self.run_subtests( 2022-11-23T02:48:20.4624321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4624487Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4624857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4625018Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4625383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4625577Z output = model(*input) 2022-11-23T02:48:20.4625917Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4626061Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4626448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4626629Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4627003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4627130Z _lazy_init(state, module) 2022-11-23T02:48:20.4627470Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4627614Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4627961Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4628086Z return func(*args, **kwargs) 2022-11-23T02:48:20.4628472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4628579Z p_assert( 2022-11-23T02:48:20.4628966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4629103Z traceback.print_stack() 2022-11-23T02:48:20.4629328Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4629572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4629701Z File "", line 1, in 2022-11-23T02:48:20.4629917Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4630063Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4630273Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4630429Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4630643Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4630731Z self.run() 2022-11-23T02:48:20.4630940Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4631088Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4631442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4631580Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4631943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4632069Z getattr(self, test_name)() 2022-11-23T02:48:20.4632433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4632522Z fn() 2022-11-23T02:48:20.4632893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4633017Z test(self, **param_kwargs) 2022-11-23T02:48:20.4633385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4633518Z return func(*args, **kwargs) 2022-11-23T02:48:20.4633782Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4633900Z self.run_subtests( 2022-11-23T02:48:20.4634260Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4634409Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4634780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4635166Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4635584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4635704Z output = model(*input) 2022-11-23T02:48:20.4636042Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4636189Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4636573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4636737Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4637115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4637242Z _lazy_init(state, module) 2022-11-23T02:48:20.4637601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4637747Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4638088Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4638293Z return func(*args, **kwargs) 2022-11-23T02:48:20.4638692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4638781Z p_assert( 2022-11-23T02:48:20.4639125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4639255Z traceback.print_stack() 2022-11-23T02:48:20.4639387Z File "", line 1, in 2022-11-23T02:48:20.4639603Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4639756Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4639964Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4640099Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4640318Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4640426Z self.run() 2022-11-23T02:48:20.4640636Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4640788Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4641138Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4641273Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4641645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4641752Z getattr(self, test_name)() 2022-11-23T02:48:20.4642121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4642221Z fn() 2022-11-23T02:48:20.4642592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4642717Z test(self, **param_kwargs) 2022-11-23T02:48:20.4643082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4643211Z return func(*args, **kwargs) 2022-11-23T02:48:20.4643472Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4643572Z self.run_subtests( 2022-11-23T02:48:20.4643933Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4644099Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4644557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4644713Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4645096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4645214Z output = model(*input) 2022-11-23T02:48:20.4645551Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4645679Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4646067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4646249Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4646626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4646754Z _lazy_init(state, module) 2022-11-23T02:48:20.4647114Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4647263Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4647608Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4647768Z return func(*args, **kwargs) 2022-11-23T02:48:20.4648169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4648271Z p_assert( 2022-11-23T02:48:20.4648612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4648740Z traceback.print_stack() 2022-11-23T02:48:20.4648984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4649225Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4649369Z File "", line 1, in 2022-11-23T02:48:20.4649566Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4649712Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4649914Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4650066Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4650282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4650391Z self.run() 2022-11-23T02:48:20.4650597Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4650729Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4651081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4651219Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4651595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4651723Z getattr(self, test_name)() 2022-11-23T02:48:20.4652088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4652191Z fn() 2022-11-23T02:48:20.4652565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4652676Z test(self, **param_kwargs) 2022-11-23T02:48:20.4653045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4653172Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4653431Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4653548Z self.run_subtests( 2022-11-23T02:48:20.4653978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4654146Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4654519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4654663Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4655052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4655176Z output = model(*input) 2022-11-23T02:48:20.4655509Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4655654Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4656034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4656220Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4656595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4656702Z _lazy_init(state, module) 2022-11-23T02:48:20.4657112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4657267Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4657613Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4657744Z return func(*args, **kwargs) 2022-11-23T02:48:20.4658128Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4658238Z p_assert( 2022-11-23T02:48:20.4658582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4658701Z traceback.print_stack() 2022-11-23T02:48:20.4658831Z File "", line 1, in 2022-11-23T02:48:20.4659049Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4659195Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4659405Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4659567Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4659786Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4659895Z self.run() 2022-11-23T02:48:20.4660083Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4660233Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4660584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4660725Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4661094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4661222Z getattr(self, test_name)() 2022-11-23T02:48:20.4661589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4661674Z fn() 2022-11-23T02:48:20.4662051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4662180Z test(self, **param_kwargs) 2022-11-23T02:48:20.4662543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4662672Z return func(*args, **kwargs) 2022-11-23T02:48:20.4662933Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4663111Z self.run_subtests( 2022-11-23T02:48:20.4663476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4663626Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4663999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4664160Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4664549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4664671Z output = model(*input) 2022-11-23T02:48:20.4665005Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4665149Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4665530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4665717Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4666077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4666203Z _lazy_init(state, module) 2022-11-23T02:48:20.4666612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4666768Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4667113Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4667238Z return func(*args, **kwargs) 2022-11-23T02:48:20.4667625Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4667733Z p_assert( 2022-11-23T02:48:20.4668060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4668195Z traceback.print_stack() 2022-11-23T02:48:20.4668441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4668680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4668814Z File "", line 1, in 2022-11-23T02:48:20.4669031Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4669180Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4669368Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4669525Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4669741Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4669847Z self.run() 2022-11-23T02:48:20.4670050Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4670207Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4670561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4670698Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4671053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4671182Z getattr(self, test_name)() 2022-11-23T02:48:20.4671553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4671656Z fn() 2022-11-23T02:48:20.4672030Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4672157Z test(self, **param_kwargs) 2022-11-23T02:48:20.4672524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4672713Z return func(*args, **kwargs) 2022-11-23T02:48:20.4672959Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4673078Z self.run_subtests( 2022-11-23T02:48:20.4673442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4673613Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4673984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4674142Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4674578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4674703Z output = model(*input) 2022-11-23T02:48:20.4675260Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4675431Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4675822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4676000Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4676452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4676587Z _lazy_init(state, module) 2022-11-23T02:48:20.4676944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4677093Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4677424Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4677550Z return func(*args, **kwargs) 2022-11-23T02:48:20.4677947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4678052Z p_assert( 2022-11-23T02:48:20.4678397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4678526Z traceback.print_stack() 2022-11-23T02:48:20.4678661Z File "", line 1, in 2022-11-23T02:48:20.4678875Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4679004Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4679205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4679362Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4679583Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4679690Z self.run() 2022-11-23T02:48:20.4679899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4680055Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4680387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4680522Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4680894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4681025Z getattr(self, test_name)() 2022-11-23T02:48:20.4681394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4681496Z fn() 2022-11-23T02:48:20.4681867Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4681994Z test(self, **param_kwargs) 2022-11-23T02:48:20.4682342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4682552Z return func(*args, **kwargs) 2022-11-23T02:48:20.4682815Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4682931Z self.run_subtests( 2022-11-23T02:48:20.4683303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4683470Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4683839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4683995Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4684360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4684485Z output = model(*input) 2022-11-23T02:48:20.4684820Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4684970Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4685353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4685532Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4685954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4686086Z _lazy_init(state, module) 2022-11-23T02:48:20.4686428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4686574Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4686914Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4687043Z return func(*args, **kwargs) 2022-11-23T02:48:20.4687437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4687546Z p_assert( 2022-11-23T02:48:20.4687891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4688018Z traceback.print_stack() 2022-11-23T02:48:20.4688246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4688489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4688621Z File "", line 1, in 2022-11-23T02:48:20.4688837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4688984Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4689193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4689349Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4689570Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4689659Z self.run() 2022-11-23T02:48:20.4689862Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4690011Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4690359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4690497Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4690866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4690992Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4691340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4691439Z fn() 2022-11-23T02:48:20.4691808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4691997Z test(self, **param_kwargs) 2022-11-23T02:48:20.4692364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4692494Z return func(*args, **kwargs) 2022-11-23T02:48:20.4692757Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4692870Z self.run_subtests( 2022-11-23T02:48:20.4693213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4693379Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4693744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4693900Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4694284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4694412Z output = model(*input) 2022-11-23T02:48:20.4694745Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4694891Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4695310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4695498Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4695874Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4695998Z _lazy_init(state, module) 2022-11-23T02:48:20.4696358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4696509Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4696857Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4696986Z return func(*args, **kwargs) 2022-11-23T02:48:20.4697355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4697464Z p_assert( 2022-11-23T02:48:20.4697811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4697939Z traceback.print_stack() 2022-11-23T02:48:20.4698071Z File "", line 1, in 2022-11-23T02:48:20.4698287Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4698433Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4698640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4698782Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4698999Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4699106Z self.run() 2022-11-23T02:48:20.4699311Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4699459Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4699811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4699949Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4700311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4700419Z getattr(self, test_name)() 
2022-11-23T02:48:20.4700788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4700890Z fn() 2022-11-23T02:48:20.4701264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4701523Z test(self, **param_kwargs) 2022-11-23T02:48:20.4701892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4702021Z return func(*args, **kwargs) 2022-11-23T02:48:20.4702269Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4702385Z self.run_subtests( 2022-11-23T02:48:20.4702749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4702919Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4703289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4703444Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4703831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4703951Z output = model(*input) 2022-11-23T02:48:20.4704285Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4704412Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4704846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4705037Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4705411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4705537Z _lazy_init(state, module) 2022-11-23T02:48:20.4705895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4706047Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4706393Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4706504Z return func(*args, **kwargs) 2022-11-23T02:48:20.4706889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4706997Z p_assert( 2022-11-23T02:48:20.4707342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4707471Z traceback.print_stack() 2022-11-23T02:48:20.4707711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4707953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4708068Z File "", line 1, in 2022-11-23T02:48:20.4708284Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4708433Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4708640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4708793Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4709008Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4709118Z self.run() 2022-11-23T02:48:20.4709326Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4709458Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4709810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4709940Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4710307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4710493Z getattr(self, test_name)() 2022-11-23T02:48:20.4710866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4710967Z fn() 2022-11-23T02:48:20.4711336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4711445Z test(self, **param_kwargs) 2022-11-23T02:48:20.4711810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4711939Z return func(*args, **kwargs) 2022-11-23T02:48:20.4712198Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4712313Z self.run_subtests( 2022-11-23T02:48:20.4712674Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4712840Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4713214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4713354Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4713740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4713911Z output = model(*input) 2022-11-23T02:48:20.4714256Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4714399Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4714777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4714956Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4715581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4715695Z _lazy_init(state, module) 2022-11-23T02:48:20.4716055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4716197Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4716543Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4716670Z return func(*args, **kwargs) 2022-11-23T02:48:20.4717059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4717165Z p_assert( 2022-11-23T02:48:20.4717508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4717620Z traceback.print_stack() 2022-11-23T02:48:20.4717754Z File "", line 1, in 2022-11-23T02:48:20.4717966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4718117Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4718322Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4718475Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4718696Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4718786Z self.run() 2022-11-23T02:48:20.4718993Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4719138Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4719487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4719622Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4719992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4720209Z getattr(self, test_name)() 2022-11-23T02:48:20.4720579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4720662Z fn() 2022-11-23T02:48:20.4721031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4721159Z test(self, **param_kwargs) 2022-11-23T02:48:20.4721519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4721648Z return func(*args, **kwargs) 2022-11-23T02:48:20.4721908Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4722023Z self.run_subtests( 2022-11-23T02:48:20.4722383Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4722537Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4722908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4723060Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4723501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4723632Z output = model(*input) 2022-11-23T02:48:20.4723968Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4724111Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4724490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4724654Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4725026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4725155Z _lazy_init(state, module) 2022-11-23T02:48:20.4725508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4725652Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4725993Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4726121Z return func(*args, **kwargs) 2022-11-23T02:48:20.4726501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4726588Z p_assert( 2022-11-23T02:48:20.4726930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4727058Z traceback.print_stack() 2022-11-23T02:48:20.4727298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4727542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4727670Z File "", line 1, in 2022-11-23T02:48:20.4727881Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4728023Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4728216Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4728368Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4728580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4728678Z self.run() 2022-11-23T02:48:20.4728881Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4729029Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4729375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4729575Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4729932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4730053Z getattr(self, test_name)() 2022-11-23T02:48:20.4730418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4730514Z fn() 2022-11-23T02:48:20.4730885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4731008Z test(self, **param_kwargs) 2022-11-23T02:48:20.4731369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4731482Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4731736Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4731853Z self.run_subtests( 2022-11-23T02:48:20.4732215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4732379Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4732812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4732975Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4733358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4733464Z output = model(*input) 2022-11-23T02:48:20.4733794Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4733934Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4734314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4734499Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4734873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4734994Z _lazy_init(state, module) 2022-11-23T02:48:20.4735350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4735493Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4735823Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4735942Z return func(*args, **kwargs) 2022-11-23T02:48:20.4736320Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4736422Z p_assert( 2022-11-23T02:48:20.4736765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4736894Z traceback.print_stack() 2022-11-23T02:48:20.4737026Z File "", line 1, in 2022-11-23T02:48:20.4737224Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4737364Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4737567Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4737719Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4737927Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4738028Z self.run() 2022-11-23T02:48:20.4738227Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4738372Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4738700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4738897Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4739260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4739382Z getattr(self, test_name)() 2022-11-23T02:48:20.4739748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4739847Z fn() 2022-11-23T02:48:20.4740214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4740332Z test(self, **param_kwargs) 2022-11-23T02:48:20.4740679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4740801Z return func(*args, **kwargs) 2022-11-23T02:48:20.4741058Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4741173Z self.run_subtests( 2022-11-23T02:48:20.4741530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4741693Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4742112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4742277Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4742646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4742766Z output = model(*input) 2022-11-23T02:48:20.4743090Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4743231Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4743605Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4743786Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4744157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4744282Z _lazy_init(state, module) 2022-11-23T02:48:20.4744628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4744774Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4745119Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4745245Z return func(*args, **kwargs) 2022-11-23T02:48:20.4745626Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4745729Z p_assert( 2022-11-23T02:48:20.4746076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4746201Z traceback.print_stack() 2022-11-23T02:48:20.4746427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4746658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4746789Z File "", line 1, in 2022-11-23T02:48:20.4746995Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4747133Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4747335Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4747487Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4747685Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4747786Z self.run() 2022-11-23T02:48:20.4748051Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4748191Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4748537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4748672Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4749032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4749159Z getattr(self, test_name)() 2022-11-23T02:48:20.4749509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4749601Z fn() 2022-11-23T02:48:20.4749965Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4750088Z test(self, **param_kwargs) 2022-11-23T02:48:20.4750444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4750571Z return func(*args, **kwargs) 2022-11-23T02:48:20.4750830Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4750941Z self.run_subtests( 2022-11-23T02:48:20.4751336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4751508Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4751877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4752031Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4752418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4752543Z output = model(*input) 2022-11-23T02:48:20.4752880Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4753021Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4753388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4753571Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4753944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4754064Z _lazy_init(state, module) 2022-11-23T02:48:20.4754420Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4754563Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4754903Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4755205Z return func(*args, **kwargs) 2022-11-23T02:48:20.4755594Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4755697Z p_assert( 2022-11-23T02:48:20.4756036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4756165Z traceback.print_stack() 2022-11-23T02:48:20.4756292Z File "", line 1, in 2022-11-23T02:48:20.4756504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4756647Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4756850Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4756986Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4757196Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4757387Z self.run() 2022-11-23T02:48:20.4757591Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4757738Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4758082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4758214Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4758567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4758690Z getattr(self, test_name)() 2022-11-23T02:48:20.4759051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4759146Z fn() 2022-11-23T02:48:20.4759517Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4759638Z test(self, **param_kwargs) 2022-11-23T02:48:20.4760000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4760119Z return func(*args, **kwargs) 2022-11-23T02:48:20.4760365Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4760481Z self.run_subtests( 2022-11-23T02:48:20.4760902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4761077Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4761446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4761595Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4761976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4762103Z output = model(*input) 2022-11-23T02:48:20.4762417Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4762556Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4762934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4763115Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4763489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4763606Z _lazy_init(state, module) 2022-11-23T02:48:20.4763961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4764108Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4764432Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4764566Z return func(*args, **kwargs) 2022-11-23T02:48:20.4764949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4765049Z p_assert( 2022-11-23T02:48:20.4765388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4765518Z traceback.print_stack() 2022-11-23T02:48:20.4765754Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4765986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4766100Z File "", line 1, in 2022-11-23T02:48:20.4766309Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4766446Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4766644Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4766870Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4767085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4767190Z self.run() 2022-11-23T02:48:20.4767398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4767533Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4767880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4768011Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4768378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4768499Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4768854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4768951Z fn() 2022-11-23T02:48:20.4769321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4769430Z test(self, **param_kwargs) 2022-11-23T02:48:20.4769788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4769965Z return func(*args, **kwargs) 2022-11-23T02:48:20.4770232Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4770345Z self.run_subtests( 2022-11-23T02:48:20.4770707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4770869Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4771232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4771378Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4771760Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4771882Z output = model(*input) 2022-11-23T02:48:20.4772208Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4772355Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4772739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4772916Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4773286Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4773394Z _lazy_init(state, module) 2022-11-23T02:48:20.4773745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4773894Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4774232Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4774398Z return func(*args, **kwargs) 2022-11-23T02:48:20.4774788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4774887Z p_assert( 2022-11-23T02:48:20.4775228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4775339Z traceback.print_stack() 2022-11-23T02:48:20.4775466Z File "", line 1, in 2022-11-23T02:48:20.4775674Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4775812Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4776015Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4776238Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4776456Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4776543Z self.run() 2022-11-23T02:48:20.4776749Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4776898Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4777243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4777372Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4777733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4777851Z getattr(self, test_name)() 
2022-11-23T02:48:20.4778215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4778304Z fn() 2022-11-23T02:48:20.4778668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4778792Z test(self, **param_kwargs) 2022-11-23T02:48:20.4779155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4779330Z return func(*args, **kwargs) 2022-11-23T02:48:20.4779593Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4779699Z self.run_subtests( 2022-11-23T02:48:20.4780059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4780210Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4780578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4780731Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4781110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4781231Z output = model(*input) 2022-11-23T02:48:20.4781561Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4781706Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4782090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4782251Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4782624Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4782748Z _lazy_init(state, module) 2022-11-23T02:48:20.4783098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4783245Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4783583Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4783706Z return func(*args, **kwargs) 2022-11-23T02:48:20.4784091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4784181Z p_assert( 2022-11-23T02:48:20.4784522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4784647Z traceback.print_stack() 2022-11-23T02:48:20.4784886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4785122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4785352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4785651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4785883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4786099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4786334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4786570Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4786801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4787028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4787253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4787484Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4787713Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4787927Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4788208Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4788441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4788668Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4788892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4789114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4789336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4789569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4789790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4790001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4790229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4790458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4790685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4790913Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4791143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4791368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4791598Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4791808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4792033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4792260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4792488Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4792715Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4792945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4793173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4793395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4793666Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4793891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4794111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4794334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4794556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4794780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4795003Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4795466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4795702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4795912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4796027Z dist init r=1, world=2 2022-11-23T02:48:20.4796438Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4796773Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4797082Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4797392Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4797701Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4798007Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4798311Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.4798616Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4798912Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4799200Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4799506Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4799807Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.4799921Z dist init r=0, world=2 2022-11-23T02:48:20.4800249Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4800571Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4800876Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4801254Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4801563Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.4801858Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4802164Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4802453Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4802762Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4803107Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4803424Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4803726Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.4803821Z ok (6.114s) 2022-11-23T02:48:20.4804182Z test_nested_always_wrap_model_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90228 2022-11-23T02:48:20.4804411Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90229 2022-11-23T02:48:20.4804805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4804980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4805357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4805552Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4805927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.4806104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.4806487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.4806687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.4806935Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.4807183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.4807597Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.4807988Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.4808222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.4808450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.4808683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4808979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4810008Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.4810123Z warnings.warn( 2022-11-23T02:48:20.4811138Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.4811253Z warnings.warn( 2022-11-23T02:48:20.4811382Z File "", line 1, in 2022-11-23T02:48:20.4811600Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4811776Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4811987Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4812139Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4812353Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4812458Z self.run() 2022-11-23T02:48:20.4812663Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4812808Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4813147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4813286Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4813651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4813775Z getattr(self, test_name)() 2022-11-23T02:48:20.4814145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4814243Z fn() 2022-11-23T02:48:20.4814608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4814728Z test(self, **param_kwargs) 2022-11-23T02:48:20.4815077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4815200Z return func(*args, **kwargs) 2022-11-23T02:48:20.4815456Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4815574Z self.run_subtests( 2022-11-23T02:48:20.4815932Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4816090Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4816458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4816610Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4816977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4817093Z output = model(*input) 2022-11-23T02:48:20.4817427Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4817569Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4818025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4818203Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4818578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4818702Z _lazy_init(state, module) 2022-11-23T02:48:20.4819046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4819193Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4819537Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4819662Z return func(*args, **kwargs) 2022-11-23T02:48:20.4820043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4820153Z p_assert( 2022-11-23T02:48:20.4820493Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4820620Z traceback.print_stack() 2022-11-23T02:48:20.4820734Z File "", line 1, in 2022-11-23T02:48:20.4820947Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4821136Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4821351Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4821504Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4821719Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4821824Z self.run() 2022-11-23T02:48:20.4822012Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4822159Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4822513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4822648Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4823014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4823134Z getattr(self, test_name)() 2022-11-23T02:48:20.4823495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4823598Z fn() 2022-11-23T02:48:20.4823953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4824078Z test(self, **param_kwargs) 2022-11-23T02:48:20.4824442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4824566Z return func(*args, **kwargs) 2022-11-23T02:48:20.4824825Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4824948Z self.run_subtests( 2022-11-23T02:48:20.4825309Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4825477Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4825835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4825990Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4826370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4826484Z output = model(*input) 2022-11-23T02:48:20.4826807Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4826945Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4827410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4827589Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4827946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4828074Z _lazy_init(state, module) 2022-11-23T02:48:20.4828436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4828579Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4828918Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4829043Z return func(*args, **kwargs) 2022-11-23T02:48:20.4829426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4829532Z p_assert( 2022-11-23T02:48:20.4829862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4829990Z traceback.print_stack() 2022-11-23T02:48:20.4830231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4830518Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4830654Z File "", line 1, in 2022-11-23T02:48:20.4830863Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4831004Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4831201Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4831338Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4831554Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4831659Z self.run() 2022-11-23T02:48:20.4831860Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4832008Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4832354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4832483Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4832848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4832958Z getattr(self, test_name)() 2022-11-23T02:48:20.4833320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4833414Z fn() 2022-11-23T02:48:20.4833777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4833901Z test(self, **param_kwargs) 2022-11-23T02:48:20.4834264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4834390Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4834633Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4834746Z self.run_subtests( 2022-11-23T02:48:20.4835273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4835448Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4835816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4835971Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4836349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4836552Z output = model(*input) 2022-11-23T02:48:20.4836885Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4837011Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4837384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4837560Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4837927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4838047Z _lazy_init(state, module) 2022-11-23T02:48:20.4838396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4838535Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4838880Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4838995Z return func(*args, **kwargs) 2022-11-23T02:48:20.4839373Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4839470Z p_assert( 2022-11-23T02:48:20.4839925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4840058Z traceback.print_stack() 2022-11-23T02:48:20.4840179Z File "", line 1, in 2022-11-23T02:48:20.4840389Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4840516Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4840716Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4840868Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4841082Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4841191Z self.run() 2022-11-23T02:48:20.4841399Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4841549Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4841900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4842021Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4842389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4842514Z getattr(self, test_name)() 2022-11-23T02:48:20.4842876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4842977Z fn() 2022-11-23T02:48:20.4843350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4843483Z test(self, **param_kwargs) 2022-11-23T02:48:20.4843849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4843960Z return func(*args, **kwargs) 2022-11-23T02:48:20.4844220Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4844337Z self.run_subtests( 2022-11-23T02:48:20.4844701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4844864Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4845229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4845382Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4845763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4845930Z output = model(*input) 2022-11-23T02:48:20.4846259Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4846399Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4846783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4846964Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4847335Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4847455Z _lazy_init(state, module) 2022-11-23T02:48:20.4847805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4847936Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4848279Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4848410Z return func(*args, **kwargs) 2022-11-23T02:48:20.4848799Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4848901Z p_assert( 2022-11-23T02:48:20.4849291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4849429Z traceback.print_stack() 2022-11-23T02:48:20.4849673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4849897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4850023Z File "", line 1, in 2022-11-23T02:48:20.4850236Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4850374Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4850578Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4850724Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4850937Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4851026Z self.run() 2022-11-23T02:48:20.4851225Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4851372Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4851722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4851852Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4852220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4852343Z getattr(self, test_name)() 2022-11-23T02:48:20.4852706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4852796Z fn() 2022-11-23T02:48:20.4853166Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4853287Z test(self, **param_kwargs) 2022-11-23T02:48:20.4853645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4853768Z return func(*args, **kwargs) 2022-11-23T02:48:20.4854026Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4854142Z self.run_subtests( 2022-11-23T02:48:20.4854504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4854654Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4855018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4855238Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4855617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4855737Z output = model(*input) 2022-11-23T02:48:20.4856068Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4856207Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4856583Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4856749Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4857119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4857242Z _lazy_init(state, module) 2022-11-23T02:48:20.4857606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4857747Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4858092Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4858221Z return func(*args, **kwargs) 2022-11-23T02:48:20.4858644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4858741Z p_assert( 2022-11-23T02:48:20.4859089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4859214Z traceback.print_stack() 2022-11-23T02:48:20.4859344Z File "", line 1, in 2022-11-23T02:48:20.4859552Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4859697Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4859913Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4860066Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4860265Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4860369Z self.run() 2022-11-23T02:48:20.4860578Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4860726Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4861073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4861200Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4861564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4861676Z getattr(self, test_name)() 2022-11-23T02:48:20.4862043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4862147Z fn() 2022-11-23T02:48:20.4862519Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4862641Z test(self, **param_kwargs) 2022-11-23T02:48:20.4863002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4863128Z return func(*args, **kwargs) 2022-11-23T02:48:20.4863382Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4863480Z self.run_subtests( 2022-11-23T02:48:20.4863832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4863996Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4864361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4864577Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4864961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4865085Z output = model(*input) 2022-11-23T02:48:20.4865425Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4865552Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4865934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4866109Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4866479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4866596Z _lazy_init(state, module) 2022-11-23T02:48:20.4866952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4867095Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4867433Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4867545Z return func(*args, **kwargs) 2022-11-23T02:48:20.4867976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4868085Z p_assert( 2022-11-23T02:48:20.4868423Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4868546Z traceback.print_stack() 2022-11-23T02:48:20.4868781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4869012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4869148Z File "", line 1, in 2022-11-23T02:48:20.4869349Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4869490Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4869687Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4869839Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4870054Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4870156Z self.run() 2022-11-23T02:48:20.4870354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4870495Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4870826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4870954Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4871314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4871439Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4871800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4871898Z fn() 2022-11-23T02:48:20.4872270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4872396Z test(self, **param_kwargs) 2022-11-23T02:48:20.4872744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4872917Z return func(*args, **kwargs) 2022-11-23T02:48:20.4873180Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4873292Z self.run_subtests( 2022-11-23T02:48:20.4873647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4873873Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4874243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4874427Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4874797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4874916Z output = model(*input) 2022-11-23T02:48:20.4875423Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4875564Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4875944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4876118Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4876499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4876619Z _lazy_init(state, module) 2022-11-23T02:48:20.4876959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4877176Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4877532Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4877655Z return func(*args, **kwargs) 2022-11-23T02:48:20.4878036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4878141Z p_assert( 2022-11-23T02:48:20.4878482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4878616Z traceback.print_stack() 2022-11-23T02:48:20.4878732Z File "", line 1, in 2022-11-23T02:48:20.4878942Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4879080Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4879281Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4879432Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4879642Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4879740Z self.run() 2022-11-23T02:48:20.4879928Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4880073Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4880423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4880557Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4880927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4881051Z getattr(self, test_name)() 
2022-11-23T02:48:20.4881407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4881503Z fn() 2022-11-23T02:48:20.4881865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4881990Z test(self, **param_kwargs) 2022-11-23T02:48:20.4882353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4882481Z return func(*args, **kwargs) 2022-11-23T02:48:20.4882741Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4882853Z self.run_subtests( 2022-11-23T02:48:20.4883289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4883453Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4883805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4883955Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4884336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4884458Z output = model(*input) 2022-11-23T02:48:20.4884789Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4884925Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4885300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4885481Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4885838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4885954Z _lazy_init(state, module) 2022-11-23T02:48:20.4886303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4886495Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4886852Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4886971Z return func(*args, **kwargs) 2022-11-23T02:48:20.4887348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4887452Z p_assert( 2022-11-23T02:48:20.4887779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4887914Z traceback.print_stack() 2022-11-23T02:48:20.4888150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4888392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4888522Z File "", line 1, in 2022-11-23T02:48:20.4888735Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4888879Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4889081Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4889217Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4889431Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4889537Z self.run() 2022-11-23T02:48:20.4889740Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4889882Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4890235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4890369Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4890720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4890843Z getattr(self, test_name)() 2022-11-23T02:48:20.4891200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4891293Z fn() 2022-11-23T02:48:20.4891661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4891782Z test(self, **param_kwargs) 2022-11-23T02:48:20.4892139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4892263Z return func(*args, **kwargs) 2022-11-23T02:48:20.4892588Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4892702Z self.run_subtests( 2022-11-23T02:48:20.4893062Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4893231Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4893597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4893754Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4894135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4894256Z output = model(*input) 2022-11-23T02:48:20.4894575Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4894721Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4895106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4895282Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4895705Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4895835Z _lazy_init(state, module) 2022-11-23T02:48:20.4896189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4896338Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4896662Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4896789Z return func(*args, **kwargs) 2022-11-23T02:48:20.4897174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4897288Z p_assert( 2022-11-23T02:48:20.4897627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4897757Z traceback.print_stack() 2022-11-23T02:48:20.4897888Z File "", line 1, in 2022-11-23T02:48:20.4898107Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4898237Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4898440Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4898593Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4898800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4898901Z self.run() 2022-11-23T02:48:20.4899102Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4899253Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4899595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4899714Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4900080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4900206Z getattr(self, test_name)() 2022-11-23T02:48:20.4900572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4900669Z fn() 2022-11-23T02:48:20.4901035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4901158Z test(self, **param_kwargs) 2022-11-23T02:48:20.4901504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4901691Z return func(*args, **kwargs) 2022-11-23T02:48:20.4901951Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4902064Z self.run_subtests( 2022-11-23T02:48:20.4902420Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4902588Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4902956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4903111Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4903490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4903597Z output = model(*input) 2022-11-23T02:48:20.4903926Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4904071Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4904457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4904640Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4905064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4905194Z _lazy_init(state, module) 2022-11-23T02:48:20.4905556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4905685Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4906031Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4906160Z return func(*args, **kwargs) 2022-11-23T02:48:20.4906543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4906649Z p_assert( 2022-11-23T02:48:20.4906987Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4907116Z traceback.print_stack() 2022-11-23T02:48:20.4907343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4907584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4907712Z File "", line 1, in 2022-11-23T02:48:20.4907926Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4908066Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4908274Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4908425Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4908643Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4908734Z self.run() 2022-11-23T02:48:20.4908931Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4909075Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4909424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4909556Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4909925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4910045Z getattr(self, test_name)() 2022-11-23T02:48:20.4910409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4910491Z fn() 2022-11-23T02:48:20.4910858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4911044Z test(self, **param_kwargs) 2022-11-23T02:48:20.4911405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4911528Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4911791Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4911901Z self.run_subtests( 2022-11-23T02:48:20.4912259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4912411Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4912778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4912927Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4913304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4913426Z output = model(*input) 2022-11-23T02:48:20.4913754Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4913898Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4914331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4914507Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4914880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4915002Z _lazy_init(state, module) 2022-11-23T02:48:20.4915609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4915750Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4916098Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4916218Z return func(*args, **kwargs) 2022-11-23T02:48:20.4916596Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4916683Z p_assert( 2022-11-23T02:48:20.4917028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4917152Z traceback.print_stack() 2022-11-23T02:48:20.4917281Z File "", line 1, in 2022-11-23T02:48:20.4917494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4917638Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4917837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4917973Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4918194Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4918289Z self.run() 2022-11-23T02:48:20.4918488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4918634Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4918980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4919112Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4919480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4919588Z getattr(self, test_name)() 2022-11-23T02:48:20.4919957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4920052Z fn() 2022-11-23T02:48:20.4920416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4920635Z test(self, **param_kwargs) 2022-11-23T02:48:20.4920993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4921110Z return func(*args, **kwargs) 2022-11-23T02:48:20.4921372Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4921469Z self.run_subtests( 2022-11-23T02:48:20.4921831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4921999Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4922369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4922525Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4922908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4923038Z output = model(*input) 2022-11-23T02:48:20.4923372Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4923500Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4923938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4924127Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4924499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4924620Z _lazy_init(state, module) 2022-11-23T02:48:20.4924971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4925116Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4925467Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4925579Z return func(*args, **kwargs) 2022-11-23T02:48:20.4925959Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4926062Z p_assert( 2022-11-23T02:48:20.4926403Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4926531Z traceback.print_stack() 2022-11-23T02:48:20.4926766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4927008Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4927140Z File "", line 1, in 2022-11-23T02:48:20.4927339Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4927487Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4927683Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4927837Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4928047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4928146Z self.run() 2022-11-23T02:48:20.4928351Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4928484Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4928827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4928960Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4929322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4929441Z getattr(self, test_name)() 2022-11-23T02:48:20.4929872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4929972Z fn() 2022-11-23T02:48:20.4930338Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4930448Z test(self, **param_kwargs) 2022-11-23T02:48:20.4930812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4930941Z return func(*args, **kwargs) 2022-11-23T02:48:20.4931197Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4931306Z self.run_subtests( 2022-11-23T02:48:20.4931658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4931817Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4932191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4932328Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4932712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4932834Z output = model(*input) 2022-11-23T02:48:20.4933222Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4933375Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4933760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4933935Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4934307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4934430Z _lazy_init(state, module) 2022-11-23T02:48:20.4934776Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4934919Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4935260Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4935385Z return func(*args, **kwargs) 2022-11-23T02:48:20.4935509Z File "", line 1, in 2022-11-23T02:48:20.4935893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4935991Z p_assert( 2022-11-23T02:48:20.4936317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4936443Z traceback.print_stack() 2022-11-23T02:48:20.4936647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4936795Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4936996Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4937149Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4937358Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4937462Z self.run() 2022-11-23T02:48:20.4937654Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4937803Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4938148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4938279Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4938644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4938763Z getattr(self, test_name)() 2022-11-23T02:48:20.4939198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4939295Z fn() 2022-11-23T02:48:20.4939649Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4939766Z test(self, **param_kwargs) 2022-11-23T02:48:20.4940126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4940252Z return func(*args, **kwargs) 2022-11-23T02:48:20.4940509Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4940622Z self.run_subtests( 2022-11-23T02:48:20.4940982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4941150Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4941509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4941664Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4942049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4942214Z output = model(*input) 2022-11-23T02:48:20.4942553Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4942690Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4943072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4943250Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4943606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4943735Z _lazy_init(state, module) 2022-11-23T02:48:20.4944091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.4944234Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4944577Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4944709Z return func(*args, **kwargs) 2022-11-23T02:48:20.4945092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4945199Z p_assert( 2022-11-23T02:48:20.4945528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4945659Z traceback.print_stack() 2022-11-23T02:48:20.4945904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4946152Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4946283Z File "", line 1, in 2022-11-23T02:48:20.4946495Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4946637Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4946828Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4946975Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4947190Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4947293Z self.run() 2022-11-23T02:48:20.4947495Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4947637Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4947983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4948179Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4948535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4948653Z getattr(self, 
test_name)() 2022-11-23T02:48:20.4949013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4949112Z fn() 2022-11-23T02:48:20.4949480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4949601Z test(self, **param_kwargs) 2022-11-23T02:48:20.4949960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4950089Z return func(*args, **kwargs) 2022-11-23T02:48:20.4950332Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4950452Z self.run_subtests( 2022-11-23T02:48:20.4950806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4950965Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4951326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4951526Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4951916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4952035Z output = model(*input) 2022-11-23T02:48:20.4952349Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4952487Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4952861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4953039Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4953403Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.4953526Z _lazy_init(state, module) 2022-11-23T02:48:20.4953880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4954022Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4954350Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4954476Z return func(*args, **kwargs) 2022-11-23T02:48:20.4954859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4954956Z p_assert( 2022-11-23T02:48:20.4955486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4955615Z traceback.print_stack() 2022-11-23T02:48:20.4955739Z File "", line 1, in 2022-11-23T02:48:20.4955949Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4956077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4956286Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4956437Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4956652Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4956751Z self.run() 2022-11-23T02:48:20.4956950Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4957097Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4957429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4957670Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4958042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4958163Z getattr(self, test_name)() 
2022-11-23T02:48:20.4958524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4958628Z fn() 2022-11-23T02:48:20.4958992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4959117Z test(self, **param_kwargs) 2022-11-23T02:48:20.4959463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4959586Z return func(*args, **kwargs) 2022-11-23T02:48:20.4959838Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4959954Z self.run_subtests( 2022-11-23T02:48:20.4960312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4960480Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4960910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4961075Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4961444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4961558Z output = model(*input) 2022-11-23T02:48:20.4961887Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4962031Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4962407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4962596Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4962969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.4963090Z _lazy_init(state, module) 2022-11-23T02:48:20.4963435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4963578Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4963918Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4964042Z return func(*args, **kwargs) 2022-11-23T02:48:20.4964435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4964538Z p_assert( 2022-11-23T02:48:20.4964876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4965009Z traceback.print_stack() 2022-11-23T02:48:20.4965233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4965475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.4965609Z File "", line 1, in 2022-11-23T02:48:20.4965814Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4965955Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4966152Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4966302Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4966512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4966602Z self.run() 2022-11-23T02:48:20.4966805Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4967012Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4967364Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4967499Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4967869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4967989Z getattr(self, test_name)() 2022-11-23T02:48:20.4968352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4968436Z fn() 2022-11-23T02:48:20.4968809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4968932Z test(self, **param_kwargs) 2022-11-23T02:48:20.4969293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4969421Z return func(*args, **kwargs) 2022-11-23T02:48:20.4969676Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4969790Z self.run_subtests( 2022-11-23T02:48:20.4970184Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4970361Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4970740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4970898Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4971277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4971397Z output = model(*input) 2022-11-23T02:48:20.4971727Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4971875Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4972260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4972425Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4972808Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4972930Z _lazy_init(state, module) 2022-11-23T02:48:20.4973290Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4973434Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4973773Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4973896Z return func(*args, **kwargs) 2022-11-23T02:48:20.4974288Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4974425Z p_assert( 2022-11-23T02:48:20.4974766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4974964Z traceback.print_stack() 2022-11-23T02:48:20.4975098Z File "", line 1, in 2022-11-23T02:48:20.4975306Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4975441Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4975637Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4975774Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4975983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4976081Z self.run() 2022-11-23T02:48:20.4976280Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4976489Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4976833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4976973Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4977341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4977453Z getattr(self, test_name)() 2022-11-23T02:48:20.4977817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4977910Z fn() 2022-11-23T02:48:20.4978271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4978389Z test(self, **param_kwargs) 2022-11-23T02:48:20.4978744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4978870Z return func(*args, **kwargs) 2022-11-23T02:48:20.4979129Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4979229Z self.run_subtests( 2022-11-23T02:48:20.4979640Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4979810Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4980176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4980326Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4980704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4980826Z output = model(*input) 2022-11-23T02:48:20.4981167Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4981297Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4981671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4981853Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4982230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4982352Z _lazy_init(state, module) 2022-11-23T02:48:20.4982709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4982848Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4983193Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4983304Z return func(*args, **kwargs) 2022-11-23T02:48:20.4983692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4983795Z p_assert( 2022-11-23T02:48:20.4984136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.4984265Z traceback.print_stack() 2022-11-23T02:48:20.4984509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4984748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.4984872Z File "", line 1, in 2022-11-23T02:48:20.4985070Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4985205Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4985404Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4985672Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4985884Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4985987Z self.run() 2022-11-23T02:48:20.4986188Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4986319Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4986671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4986802Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4987167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4987287Z getattr(self, test_name)() 2022-11-23T02:48:20.4987650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4987751Z fn() 2022-11-23T02:48:20.4988125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4988236Z test(self, **param_kwargs) 2022-11-23T02:48:20.4988596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.4988722Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.4989033Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4989155Z self.run_subtests( 2022-11-23T02:48:20.4989509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4989671Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4990039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4990177Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.4990566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.4990681Z output = model(*input) 2022-11-23T02:48:20.4991008Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.4991155Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.4991537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.4991714Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.4992083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.4992192Z _lazy_init(state, module) 2022-11-23T02:48:20.4992540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.4992687Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.4993033Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.4993156Z return func(*args, **kwargs) 2022-11-23T02:48:20.4993541Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.4993649Z p_assert( 2022-11-23T02:48:20.4993994Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.4994106Z traceback.print_stack() 2022-11-23T02:48:20.4994240Z File "", line 1, in 2022-11-23T02:48:20.4994451Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.4994589Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.4994793Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.4995009Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.4995410Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.4995515Z self.run() 2022-11-23T02:48:20.4995704Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.4995852Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.4996203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.4996336Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.4996702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.4996827Z getattr(self, test_name)() 2022-11-23T02:48:20.4997186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.4997271Z fn() 2022-11-23T02:48:20.4997647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.4997764Z test(self, **param_kwargs) 2022-11-23T02:48:20.4998121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.4998249Z return func(*args, **kwargs) 2022-11-23T02:48:20.4998582Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.4998704Z self.run_subtests( 2022-11-23T02:48:20.4999061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.4999209Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.4999575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.4999735Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5000119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5000239Z output = model(*input) 2022-11-23T02:48:20.5000571Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5000718Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5001101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5001265Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5001637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5001754Z _lazy_init(state, module) 2022-11-23T02:48:20.5002102Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.5002250Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5002590Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5002713Z return func(*args, **kwargs) 2022-11-23T02:48:20.5003101Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5003190Z p_assert( 2022-11-23T02:48:20.5003530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5003655Z traceback.print_stack() 2022-11-23T02:48:20.5003896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5004132Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5004254Z File "", line 1, in 2022-11-23T02:48:20.5004542Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5004682Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5004870Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5005016Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5005228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5005329Z self.run() 2022-11-23T02:48:20.5005530Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5005676Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5006024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5006150Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5006501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5006627Z getattr(self, test_name)() 2022-11-23T02:48:20.5006989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5007089Z fn() 2022-11-23T02:48:20.5007455Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5007627Z test(self, **param_kwargs) 2022-11-23T02:48:20.5007998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5008117Z return func(*args, **kwargs) 2022-11-23T02:48:20.5008362Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.5008472Z self.run_subtests( 2022-11-23T02:48:20.5008823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5008990Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5009354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5009510Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5009896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5010022Z output = model(*input) 2022-11-23T02:48:20.5010341Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5010484Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5010863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5011036Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5011402Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5011527Z _lazy_init(state, module) 2022-11-23T02:48:20.5011879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.5012025Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5012354Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5012479Z return func(*args, **kwargs) 2022-11-23T02:48:20.5012859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5012955Z p_assert( 2022-11-23T02:48:20.5013298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5013421Z traceback.print_stack() 2022-11-23T02:48:20.5013548Z File "", line 1, in 2022-11-23T02:48:20.5013806Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5013947Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5014148Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5014295Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5014510Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5014617Z self.run() 2022-11-23T02:48:20.5014821Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5014968Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5015306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5015437Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5015799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5015921Z getattr(self, test_name)() 2022-11-23T02:48:20.5016282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5016381Z fn() 2022-11-23T02:48:20.5016744Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5016915Z test(self, **param_kwargs) 2022-11-23T02:48:20.5017277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5017397Z return func(*args, **kwargs) 2022-11-23T02:48:20.5017652Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.5017764Z self.run_subtests( 2022-11-23T02:48:20.5018121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5018292Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5018664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5018818Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5019190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5019309Z output = model(*input) 2022-11-23T02:48:20.5019633Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5019777Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5020160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5020336Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5020709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5020835Z _lazy_init(state, module) 2022-11-23T02:48:20.5021176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:48:20.5021320Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5021668Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5021792Z return func(*args, **kwargs) 2022-11-23T02:48:20.5022167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5022263Z p_assert( 2022-11-23T02:48:20.5022601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5022725Z traceback.print_stack() 2022-11-23T02:48:20.5022950Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5023264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5023393Z File "", line 1, in 2022-11-23T02:48:20.5023609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5023747Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5023954Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5024106Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5024321Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5024410Z self.run() 2022-11-23T02:48:20.5024613Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5024755Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5025105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5025245Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5025617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5025738Z getattr(self, 
test_name)() 2022-11-23T02:48:20.5026134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5026239Z fn() 2022-11-23T02:48:20.5026608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5026729Z test(self, **param_kwargs) 2022-11-23T02:48:20.5027087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5027205Z return func(*args, **kwargs) 2022-11-23T02:48:20.5027464Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.5027588Z self.run_subtests( 2022-11-23T02:48:20.5027934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5028096Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5028470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5028627Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5029003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5029123Z output = model(*input) 2022-11-23T02:48:20.5029455Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5029594Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5029964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5030145Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5030517Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:48:20.5030632Z _lazy_init(state, module) 2022-11-23T02:48:20.5030989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.5031135Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5031473Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5031593Z return func(*args, **kwargs) 2022-11-23T02:48:20.5031962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5032123Z p_assert( 2022-11-23T02:48:20.5032467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5032597Z traceback.print_stack() 2022-11-23T02:48:20.5032727Z File "", line 1, in 2022-11-23T02:48:20.5032934Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5033077Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5033280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5033419Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5033633Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5033737Z self.run() 2022-11-23T02:48:20.5033936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5034081Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5034426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5034560Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5034921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5035250Z getattr(self, test_name)() 
2022-11-23T02:48:20.5035715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5035826Z fn() 2022-11-23T02:48:20.5036203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5036327Z test(self, **param_kwargs) 2022-11-23T02:48:20.5036686Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5036808Z return func(*args, **kwargs) 2022-11-23T02:48:20.5037054Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:48:20.5037177Z self.run_subtests( 2022-11-23T02:48:20.5037538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5037703Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5038077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5038237Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5038616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5038736Z output = model(*input) 2022-11-23T02:48:20.5039051Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5039190Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5039575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5039752Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5040124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5040249Z _lazy_init(state, module) 2022-11-23T02:48:20.5040610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:48:20.5040751Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5041094Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5041204Z return func(*args, **kwargs) 2022-11-23T02:48:20.5041585Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5041767Z p_assert( 2022-11-23T02:48:20.5042109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5042236Z traceback.print_stack() 2022-11-23T02:48:20.5042473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5042715Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5042954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5043173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5043402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5043625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5043852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5044089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5044312Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5044541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5044816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5045043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5045270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5045493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5045722Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5045946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5046180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5046407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5046634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5046848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5047073Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5047292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5047522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5047746Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5047976Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5048206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5048429Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5048639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5048870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5049095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5049316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5049537Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5049760Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5050043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5050263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5050483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5050696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5050922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5051146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5051368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5051591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5051816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5052047Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5052270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5052480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5052748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5052980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5053204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5053308Z dist init r=1, world=2 2022-11-23T02:48:20.5053637Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5053959Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5054276Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5054588Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5054897Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5055185Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5055489Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.5055793Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5056095Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5056400Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5056703Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5057010Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5057184Z dist init r=0, world=2 2022-11-23T02:48:20.5057514Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5057832Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5058147Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5058439Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5058746Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.5059051Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5059353Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5059705Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5060016Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5060315Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5060617Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5060926Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5061034Z ok (6.114s) 2022-11-23T02:48:20.5061370Z test_nested_wrapped_model_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90311 2022-11-23T02:48:20.5061578Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90312 2022-11-23T02:48:20.5061968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5062143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5062528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5062726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5063102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5063277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5063662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5063857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5064091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5064336Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5064745Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5065219Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5065453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5065681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5065940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5066178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5067205Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5067328Z warnings.warn( 2022-11-23T02:48:20.5068393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5068500Z warnings.warn( 2022-11-23T02:48:20.5068735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5068964Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5069197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5069435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5069662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5069889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5070117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5070334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5070565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5070793Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5071022Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5071243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5071470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5071690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5071917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5072144Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5072357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5072584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5072809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5073035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5073258Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5073543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5073767Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5073991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5074205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5074472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5074694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5074918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5075316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5075550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5075775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5075999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5076300Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5076522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5076742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5076964Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5077190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5077411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5077641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5077863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5078088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5078301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5079075Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5079825Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5080062Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5080287Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5080516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5080745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5080859Z dist init r=0, world=2 2022-11-23T02:48:20.5080971Z dist init r=1, world=2 2022-11-23T02:48:20.5081056Z ok (5.012s) 2022-11-23T02:48:20.5081386Z test_nested_wrapped_model_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90394 2022-11-23T02:48:20.5081608Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90395 2022-11-23T02:48:20.5082087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5082260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5082644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5082839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5083213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5083386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5083754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5083948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5084195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5084435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5084838Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5085291Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5085531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5085759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5085990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5086207Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5087242Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5087358Z warnings.warn( 2022-11-23T02:48:20.5088375Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5088492Z warnings.warn( 2022-11-23T02:48:20.5088727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5088958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5089188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5089426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5089654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5089876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5090089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5090320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5090548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5090843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5091066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5091286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5091512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5091742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5091954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5092182Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5092401Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5092631Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5092857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5093079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5093349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5093584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5093806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5094014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5094238Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5094459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5094686Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5094905Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5095128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5095351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5095571Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5095779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5096003Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5096223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5096442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5096670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5096895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5097116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5097346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5097558Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5097783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5098004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5098227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5098454Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5098739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5098957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5099070Z dist init r=1, world=2 2022-11-23T02:48:20.5099166Z dist init r=0, world=2 2022-11-23T02:48:20.5099269Z ok (5.312s) 2022-11-23T02:48:20.5099613Z test_nested_wrapped_model_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90477 2022-11-23T02:48:20.5099831Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90478 2022-11-23T02:48:20.5100220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5100392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5100778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5100979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5101352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5101564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5101965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5102152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5102394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5102635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5103040Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5103447Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5103682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5103901Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5104136Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5104368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5105389Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5105506Z warnings.warn( 2022-11-23T02:48:20.5106518Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5106627Z warnings.warn( 2022-11-23T02:48:20.5106860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5107091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5107379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5107608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5107822Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5108055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5108279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5108508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5108733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5108957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5109180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5109412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5109641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5109852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5110128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5110368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5110593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5110820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5111046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5111272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5111503Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5111717Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5111945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5112175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5112399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5112621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5112843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5113064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5113284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5113500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5113718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5113942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5114170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5114392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5114611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5114830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5115229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5115549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5115761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5115980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5116203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5116430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5116651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5116872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5117095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5117314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5117418Z dist init r=1, world=2 2022-11-23T02:48:20.5117530Z dist init r=0, world=2 2022-11-23T02:48:20.5117627Z ok (5.313s) 2022-11-23T02:48:20.5117964Z test_nested_wrapped_model_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90560 2022-11-23T02:48:20.5118297Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90561 2022-11-23T02:48:20.5118701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5118877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5119262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5119439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5119812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5119991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5120371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5120567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5120819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5121068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5121464Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5121863Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5122081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5122312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5122547Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5122780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5123805Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5123915Z warnings.warn( 2022-11-23T02:48:20.5124929Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.5125105Z warnings.warn( 2022-11-23T02:48:20.5125237Z File "", line 1, in 2022-11-23T02:48:20.5125454Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5125597Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5125786Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5125939Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5126152Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5126256Z self.run() 2022-11-23T02:48:20.5126456Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5126598Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5126951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5127071Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5127486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5127615Z getattr(self, test_name)() 2022-11-23T02:48:20.5127986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5128083Z fn() 2022-11-23T02:48:20.5128452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5128577Z test(self, **param_kwargs) 2022-11-23T02:48:20.5128948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5129059Z return func(*args, **kwargs) 2022-11-23T02:48:20.5129313Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5129424Z self.run_subtests( 2022-11-23T02:48:20.5129782Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5129951Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5130322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5130475Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5130852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5130962Z output = model(*input) 2022-11-23T02:48:20.5131294Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5131434Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5131822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5132006Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5132372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5132491Z _lazy_init(state, module) 2022-11-23T02:48:20.5132847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5132976Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5133320Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5133508Z return func(*args, **kwargs) 2022-11-23T02:48:20.5133898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5133998Z p_assert( 2022-11-23T02:48:20.5134343Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5134466Z traceback.print_stack() 2022-11-23T02:48:20.5134590Z File "", line 1, in 2022-11-23T02:48:20.5134787Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5134927Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5135128Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5135279Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5135489Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5135593Z self.run() 2022-11-23T02:48:20.5135789Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5135919Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5136266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5136446Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5136828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5136954Z getattr(self, test_name)() 2022-11-23T02:48:20.5137315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5137407Z fn() 2022-11-23T02:48:20.5137778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5137886Z test(self, **param_kwargs) 2022-11-23T02:48:20.5138252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5138381Z return func(*args, **kwargs) 2022-11-23T02:48:20.5138638Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5138744Z self.run_subtests( 2022-11-23T02:48:20.5139103Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5139266Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5139633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5139773Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5140149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5140272Z output = model(*input) 2022-11-23T02:48:20.5140600Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5140743Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5141127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5141312Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5141679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5141786Z _lazy_init(state, module) 2022-11-23T02:48:20.5142141Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5142281Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5142621Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5142821Z return func(*args, **kwargs) 2022-11-23T02:48:20.5143208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5143307Z p_assert( 2022-11-23T02:48:20.5143646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5143760Z traceback.print_stack() 2022-11-23T02:48:20.5143992Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5144223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5144347Z File "", line 1, in 2022-11-23T02:48:20.5144554Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5144693Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5144891Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5145049Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5145249Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5145353Z self.run() 2022-11-23T02:48:20.5145552Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5145740Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5146093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5146224Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5146589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5146709Z getattr(self, test_name)() 2022-11-23T02:48:20.5147057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5147154Z fn() 2022-11-23T02:48:20.5147518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5147641Z test(self, **param_kwargs) 2022-11-23T02:48:20.5148005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5148125Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5148379Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5148478Z self.run_subtests( 2022-11-23T02:48:20.5148837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5149003Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5149370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5149529Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5149910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5150023Z output = model(*input) 2022-11-23T02:48:20.5150356Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5150500Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5150871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5151046Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5151410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5151527Z _lazy_init(state, module) 2022-11-23T02:48:20.5151883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5152083Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5152431Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5152549Z return func(*args, **kwargs) 2022-11-23T02:48:20.5152924Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5153025Z p_assert( 2022-11-23T02:48:20.5153368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5153496Z traceback.print_stack() 2022-11-23T02:48:20.5153620Z File "", line 1, in 2022-11-23T02:48:20.5153823Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5153957Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5154149Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5154296Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5154506Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5154603Z self.run() 2022-11-23T02:48:20.5154804Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5155001Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5155603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5155734Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5156088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5156211Z getattr(self, test_name)() 2022-11-23T02:48:20.5156572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5156675Z fn() 2022-11-23T02:48:20.5157044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5157164Z test(self, **param_kwargs) 2022-11-23T02:48:20.5157523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5157652Z return func(*args, **kwargs) 2022-11-23T02:48:20.5157895Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5158004Z self.run_subtests( 2022-11-23T02:48:20.5158362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5158528Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5158896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5159055Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5159430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5159552Z output = model(*input) 2022-11-23T02:48:20.5159876Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5160017Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5160396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5160571Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5160943Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5161060Z _lazy_init(state, module) 2022-11-23T02:48:20.5161511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5161653Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5161982Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5162104Z return func(*args, **kwargs) 2022-11-23T02:48:20.5162492Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5162594Z p_assert( 2022-11-23T02:48:20.5162937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5163064Z traceback.print_stack() 2022-11-23T02:48:20.5163298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5163532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5163653Z File "", line 1, in 2022-11-23T02:48:20.5163866Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5164008Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5164209Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5164356Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5164633Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5164744Z self.run() 2022-11-23T02:48:20.5164937Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5165083Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5165431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5165563Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5165936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5166066Z getattr(self, test_name)() 2022-11-23T02:48:20.5166430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5166525Z fn() 2022-11-23T02:48:20.5166887Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5167006Z test(self, **param_kwargs) 2022-11-23T02:48:20.5167360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5167484Z return func(*args, **kwargs) 2022-11-23T02:48:20.5167739Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5167850Z self.run_subtests( 2022-11-23T02:48:20.5168204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5168367Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5168722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5168872Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5169261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5169383Z output = model(*input) 2022-11-23T02:48:20.5169707Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5169850Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5170233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5170405Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5170830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5170947Z _lazy_init(state, module) 2022-11-23T02:48:20.5171302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5171444Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5171789Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5171918Z return func(*args, **kwargs) 2022-11-23T02:48:20.5172298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5172400Z p_assert( 2022-11-23T02:48:20.5172726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5172853Z traceback.print_stack() 2022-11-23T02:48:20.5173036Z File "", line 1, in 2022-11-23T02:48:20.5173245Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5173382Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5173580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5173733Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5174006Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5174106Z self.run() 2022-11-23T02:48:20.5174308Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5174490Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5174839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5174968Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5175330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5175459Z getattr(self, test_name)() 2022-11-23T02:48:20.5175807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5175909Z fn() 2022-11-23T02:48:20.5176280Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5176405Z test(self, **param_kwargs) 2022-11-23T02:48:20.5176766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5176888Z return func(*args, **kwargs) 2022-11-23T02:48:20.5177136Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5177240Z self.run_subtests( 2022-11-23T02:48:20.5177580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5177744Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5178113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5178265Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5178652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5178767Z output = model(*input) 2022-11-23T02:48:20.5179091Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5179232Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5179599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5179776Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5180216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5180337Z _lazy_init(state, module) 2022-11-23T02:48:20.5180690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5180836Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5181178Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5181302Z return func(*args, **kwargs) 2022-11-23T02:48:20.5181670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5181766Z p_assert( 2022-11-23T02:48:20.5182101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5182225Z traceback.print_stack() 2022-11-23T02:48:20.5182461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5182693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5182822Z File "", line 1, in 2022-11-23T02:48:20.5183029Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5183207Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5183418Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5183562Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5183769Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5183873Z self.run() 2022-11-23T02:48:20.5184071Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5184212Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5184566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5184683Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5185050Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5185176Z getattr(self, test_name)() 
2022-11-23T02:48:20.5185540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5185637Z fn() 2022-11-23T02:48:20.5186007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5186130Z test(self, **param_kwargs) 2022-11-23T02:48:20.5186488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5186599Z return func(*args, **kwargs) 2022-11-23T02:48:20.5186861Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5186970Z self.run_subtests( 2022-11-23T02:48:20.5187327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5187485Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5187853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5188015Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5188395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5188501Z output = model(*input) 2022-11-23T02:48:20.5188831Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5188971Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5189416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5189590Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5189958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5190083Z _lazy_init(state, module) 2022-11-23T02:48:20.5190435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5190564Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5190901Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5191023Z return func(*args, **kwargs) 2022-11-23T02:48:20.5191403Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5191507Z p_assert( 2022-11-23T02:48:20.5191849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5191974Z traceback.print_stack() 2022-11-23T02:48:20.5192088Z File "", line 1, in 2022-11-23T02:48:20.5192346Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5192497Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5192701Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5192845Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5193053Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5193152Z self.run() 2022-11-23T02:48:20.5193354Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5193484Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5193836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5193964Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5194330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5194452Z getattr(self, test_name)() 
2022-11-23T02:48:20.5194817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5194916Z fn() 2022-11-23T02:48:20.5195516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5195629Z test(self, **param_kwargs) 2022-11-23T02:48:20.5195995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5196116Z return func(*args, **kwargs) 2022-11-23T02:48:20.5196376Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5196489Z self.run_subtests( 2022-11-23T02:48:20.5196847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5197016Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5197383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5197524Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5197903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5198025Z output = model(*input) 2022-11-23T02:48:20.5198353Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5198590Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5198977Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5199157Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5199533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5199642Z _lazy_init(state, module) 2022-11-23T02:48:20.5199995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5200140Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5200487Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5200614Z return func(*args, **kwargs) 2022-11-23T02:48:20.5201001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5201108Z p_assert( 2022-11-23T02:48:20.5201446Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5201559Z traceback.print_stack() 2022-11-23T02:48:20.5201796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5202099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5202238Z File "", line 1, in 2022-11-23T02:48:20.5202449Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5202593Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5202790Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5202935Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5203136Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5203239Z self.run() 2022-11-23T02:48:20.5203442Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5203588Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5203934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5204065Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5204427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5204534Z getattr(self, test_name)() 2022-11-23T02:48:20.5204897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5204992Z fn() 2022-11-23T02:48:20.5205362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5205485Z test(self, **param_kwargs) 2022-11-23T02:48:20.5205840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5205965Z return func(*args, **kwargs) 2022-11-23T02:48:20.5206221Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5206324Z self.run_subtests( 2022-11-23T02:48:20.5206682Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5206843Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5207207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5207359Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5207739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5207945Z output = model(*input) 2022-11-23T02:48:20.5208280Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5208406Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5208788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5208965Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5209335Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5209455Z _lazy_init(state, module) 2022-11-23T02:48:20.5209804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5209949Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5210290Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5210405Z return func(*args, **kwargs) 2022-11-23T02:48:20.5210784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5210885Z p_assert( 2022-11-23T02:48:20.5211278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5211414Z traceback.print_stack() 2022-11-23T02:48:20.5211541Z File "", line 1, in 2022-11-23T02:48:20.5211747Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5211889Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5212077Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5212227Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5212433Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5212539Z self.run() 2022-11-23T02:48:20.5212739Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5212887Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5213231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5213354Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5213720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5213842Z getattr(self, test_name)() 2022-11-23T02:48:20.5214198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5214299Z fn() 2022-11-23T02:48:20.5214664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5214786Z test(self, **param_kwargs) 2022-11-23T02:48:20.5215140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5215251Z return func(*args, **kwargs) 2022-11-23T02:48:20.5215501Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5215610Z self.run_subtests( 2022-11-23T02:48:20.5215964Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5216125Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5216493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5216640Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5217017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5217188Z output = model(*input) 2022-11-23T02:48:20.5217522Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5217658Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5218040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5218215Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5218591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5218707Z _lazy_init(state, module) 2022-11-23T02:48:20.5219061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5219201Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5219535Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5219659Z return func(*args, **kwargs) 2022-11-23T02:48:20.5220042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5220144Z p_assert( 2022-11-23T02:48:20.5220534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5220671Z traceback.print_stack() 2022-11-23T02:48:20.5220910Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5221134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5221264Z File "", line 1, in 2022-11-23T02:48:20.5221465Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5221607Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5221813Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5221967Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5222181Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5222282Z self.run() 2022-11-23T02:48:20.5222474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5222619Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5222962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5223091Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5223452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5223577Z getattr(self, test_name)() 2022-11-23T02:48:20.5223943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5224046Z fn() 2022-11-23T02:48:20.5224407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5224533Z test(self, **param_kwargs) 2022-11-23T02:48:20.5224895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5225022Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5225275Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5225391Z self.run_subtests( 2022-11-23T02:48:20.5225745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5225905Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5226265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5226483Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5226864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5226988Z output = model(*input) 2022-11-23T02:48:20.5227317Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5227459Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5241788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5242041Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5242469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5242595Z _lazy_init(state, module) 2022-11-23T02:48:20.5242983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5243130Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5243484Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5243611Z return func(*args, **kwargs) 2022-11-23T02:48:20.5244138Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5244263Z p_assert( 2022-11-23T02:48:20.5244607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5244739Z traceback.print_stack() 2022-11-23T02:48:20.5244874Z File "", line 1, in 2022-11-23T02:48:20.5245091Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5245244Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5245452Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5245610Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5245811Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5245917Z self.run() 2022-11-23T02:48:20.5246125Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5246274Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5246624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5246763Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5247231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5247371Z getattr(self, test_name)() 2022-11-23T02:48:20.5247731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5247839Z fn() 2022-11-23T02:48:20.5248216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5248343Z test(self, **param_kwargs) 2022-11-23T02:48:20.5248709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5248893Z return func(*args, **kwargs) 2022-11-23T02:48:20.5249152Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5249267Z self.run_subtests( 2022-11-23T02:48:20.5249616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5249783Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5250152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5250403Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5250791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5250919Z output = model(*input) 2022-11-23T02:48:20.5251255Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5251411Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5251785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5251971Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5252344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5252475Z _lazy_init(state, module) 2022-11-23T02:48:20.5252829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5252976Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5253321Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5253504Z return func(*args, **kwargs) 2022-11-23T02:48:20.5253893Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5253998Z p_assert( 2022-11-23T02:48:20.5254338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5254467Z traceback.print_stack() 2022-11-23T02:48:20.5254713Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5254953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5255096Z File "", line 1, in 2022-11-23T02:48:20.5255313Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5255445Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5255654Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5255814Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5256028Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5256132Z self.run() 2022-11-23T02:48:20.5256339Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5256485Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5256823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5256960Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5257338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5257463Z getattr(self, test_name)() 2022-11-23T02:48:20.5257832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5257932Z fn() 2022-11-23T02:48:20.5258313Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5258441Z test(self, **param_kwargs) 2022-11-23T02:48:20.5258791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5258917Z return func(*args, **kwargs) 2022-11-23T02:48:20.5259172Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5259289Z self.run_subtests( 2022-11-23T02:48:20.5259710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5259875Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5260245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5260398Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5260773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5260896Z output = model(*input) 2022-11-23T02:48:20.5261229Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5261374Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5261758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5261949Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5262320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5262446Z _lazy_init(state, module) 2022-11-23T02:48:20.5262805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5262989Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5263345Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5263475Z return func(*args, **kwargs) 2022-11-23T02:48:20.5263860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5263960Z p_assert( 2022-11-23T02:48:20.5264305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5264443Z traceback.print_stack() 2022-11-23T02:48:20.5264561Z File "", line 1, in 2022-11-23T02:48:20.5264773Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5264920Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5265127Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5265286Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5265503Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5265607Z self.run() 2022-11-23T02:48:20.5265818Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5265955Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5266303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5266443Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5266819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5266946Z getattr(self, test_name)() 2022-11-23T02:48:20.5267312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5267412Z fn() 2022-11-23T02:48:20.5267789Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5267901Z test(self, **param_kwargs) 2022-11-23T02:48:20.5268263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5268389Z return func(*args, **kwargs) 2022-11-23T02:48:20.5268645Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5268765Z self.run_subtests( 2022-11-23T02:48:20.5269251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5269419Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5269892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5270047Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5270641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5277550Z output = model(*input) 2022-11-23T02:48:20.5277950Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5278083Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5278459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5278644Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5279012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5279121Z _lazy_init(state, module) 2022-11-23T02:48:20.5279591Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5279748Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5280091Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5280213Z return func(*args, **kwargs) 2022-11-23T02:48:20.5280590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5280690Z p_assert( 2022-11-23T02:48:20.5281029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5281147Z traceback.print_stack() 2022-11-23T02:48:20.5281383Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5281617Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5281742Z File "", line 1, in 2022-11-23T02:48:20.5281956Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5282104Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5282304Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5282440Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5282651Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5282750Z self.run() 2022-11-23T02:48:20.5282949Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5283096Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5283443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5283580Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5283944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5284058Z getattr(self, test_name)() 
2022-11-23T02:48:20.5284421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5284518Z fn() 2022-11-23T02:48:20.5284882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5285002Z test(self, **param_kwargs) 2022-11-23T02:48:20.5285358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5285600Z return func(*args, **kwargs) 2022-11-23T02:48:20.5306686Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5306827Z self.run_subtests( 2022-11-23T02:48:20.5307215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5307380Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5307750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5307900Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5308279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5308397Z output = model(*input) 2022-11-23T02:48:20.5308726Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5308859Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5309240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5309412Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5309867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5309995Z _lazy_init(state, module) 2022-11-23T02:48:20.5310357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5310500Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5310841Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5310953Z return func(*args, **kwargs) 2022-11-23T02:48:20.5311333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5311438Z p_assert( 2022-11-23T02:48:20.5311778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5311900Z traceback.print_stack() 2022-11-23T02:48:20.5312025Z File "", line 1, in 2022-11-23T02:48:20.5312238Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5312378Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5312567Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5312713Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5312923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5313022Z self.run() 2022-11-23T02:48:20.5313222Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5313369Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5313713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5313833Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5314197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5314321Z getattr(self, test_name)() 
2022-11-23T02:48:20.5314683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5314778Z fn() 2022-11-23T02:48:20.5315446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5315573Z test(self, **param_kwargs) 2022-11-23T02:48:20.5315945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5316170Z return func(*args, **kwargs) 2022-11-23T02:48:20.5316426Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5316535Z self.run_subtests( 2022-11-23T02:48:20.5316895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5317062Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5317430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5317580Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5317957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5318064Z output = model(*input) 2022-11-23T02:48:20.5318390Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5318532Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5318909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5319085Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5319523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5319652Z _lazy_init(state, module) 2022-11-23T02:48:20.5320011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5320140Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5320477Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5320600Z return func(*args, **kwargs) 2022-11-23T02:48:20.5320981Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5321087Z p_assert( 2022-11-23T02:48:20.5321427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5321551Z traceback.print_stack() 2022-11-23T02:48:20.5321792Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5322015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5322140Z File "", line 1, in 2022-11-23T02:48:20.5322347Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5322487Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5322685Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5322833Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5323047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5323148Z self.run() 2022-11-23T02:48:20.5323339Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5323480Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5323832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5323963Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5324327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5324447Z getattr(self, test_name)() 2022-11-23T02:48:20.5324808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5324903Z fn() 2022-11-23T02:48:20.5325257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5325449Z test(self, **param_kwargs) 2022-11-23T02:48:20.5325813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5325934Z return func(*args, **kwargs) 2022-11-23T02:48:20.5326191Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5326301Z self.run_subtests( 2022-11-23T02:48:20.5326650Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5326814Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5327185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5327327Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5327711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5327838Z output = model(*input) 2022-11-23T02:48:20.5328174Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5328321Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5328760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5328958Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5329341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5329451Z _lazy_init(state, module) 2022-11-23T02:48:20.5329809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5329958Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5330312Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5330443Z return func(*args, **kwargs) 2022-11-23T02:48:20.5330828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5330936Z p_assert( 2022-11-23T02:48:20.5331284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5331397Z traceback.print_stack() 2022-11-23T02:48:20.5331532Z File "", line 1, in 2022-11-23T02:48:20.5331744Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5331888Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5332090Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5332246Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5332469Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5332570Z self.run() 2022-11-23T02:48:20.5332760Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5332910Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5333258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5333398Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5333775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5333900Z getattr(self, test_name)() 2022-11-23T02:48:20.5334269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5334354Z fn() 2022-11-23T02:48:20.5334724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5334920Z test(self, **param_kwargs) 2022-11-23T02:48:20.5335290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5335424Z return func(*args, **kwargs) 2022-11-23T02:48:20.5335689Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5335807Z self.run_subtests( 2022-11-23T02:48:20.5336168Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5336317Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5336691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5336849Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5337235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5337356Z output = model(*input) 2022-11-23T02:48:20.5337690Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5337837Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5338277Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5338453Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5338834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5338965Z _lazy_init(state, module) 2022-11-23T02:48:20.5339324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5339473Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5339827Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5339959Z return func(*args, **kwargs) 2022-11-23T02:48:20.5340347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5340436Z p_assert( 2022-11-23T02:48:20.5340783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5340905Z traceback.print_stack() 2022-11-23T02:48:20.5341147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5341387Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5341513Z File "", line 1, in 2022-11-23T02:48:20.5341728Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5341881Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5342069Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5342221Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5342436Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5342544Z self.run() 2022-11-23T02:48:20.5342752Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5342901Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5343251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5343386Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5343739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5343858Z getattr(self, test_name)() 2022-11-23T02:48:20.5344304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5344406Z fn() 2022-11-23T02:48:20.5344775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5344903Z test(self, **param_kwargs) 2022-11-23T02:48:20.5345273Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5345402Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5345642Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5345756Z self.run_subtests( 2022-11-23T02:48:20.5346121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5346289Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5346661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5346822Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5347206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5347331Z output = model(*input) 2022-11-23T02:48:20.5347699Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5347855Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5348247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5348429Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5348805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5348935Z _lazy_init(state, module) 2022-11-23T02:48:20.5349297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5349446Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5349772Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5349904Z return func(*args, **kwargs) 2022-11-23T02:48:20.5350292Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5350400Z p_assert( 2022-11-23T02:48:20.5350747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5350878Z traceback.print_stack() 2022-11-23T02:48:20.5351008Z File "", line 1, in 2022-11-23T02:48:20.5351203Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5351355Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5351558Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5351713Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5351928Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5352034Z self.run() 2022-11-23T02:48:20.5352244Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5352394Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5352730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5352863Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5353234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5353364Z getattr(self, test_name)() 2022-11-23T02:48:20.5353803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5353907Z fn() 2022-11-23T02:48:20.5354274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5354404Z test(self, **param_kwargs) 2022-11-23T02:48:20.5354756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5354888Z return func(*args, **kwargs) 2022-11-23T02:48:20.5355329Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5355456Z self.run_subtests( 2022-11-23T02:48:20.5355828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5355998Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5356379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5356540Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5356905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5357115Z output = model(*input) 2022-11-23T02:48:20.5357468Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5357616Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5358000Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5358178Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5358551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5358684Z _lazy_init(state, module) 2022-11-23T02:48:20.5359027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5359176Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5359527Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5359663Z return func(*args, **kwargs) 2022-11-23T02:48:20.5360054Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5360162Z p_assert( 2022-11-23T02:48:20.5360509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5360639Z traceback.print_stack() 2022-11-23T02:48:20.5360864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5361109Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5361242Z File "", line 1, in 2022-11-23T02:48:20.5361458Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5361604Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5361815Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5361967Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5362166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5362273Z self.run() 2022-11-23T02:48:20.5362476Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5362624Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5362977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5363205Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5363578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5363707Z getattr(self, test_name)() 2022-11-23T02:48:20.5364057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5364165Z fn() 2022-11-23T02:48:20.5364532Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5364656Z test(self, **param_kwargs) 2022-11-23T02:48:20.5365023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5365153Z return func(*args, **kwargs) 2022-11-23T02:48:20.5365416Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5365535Z self.run_subtests( 2022-11-23T02:48:20.5365880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5366049Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5366420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5366632Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5367030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5367154Z output = model(*input) 2022-11-23T02:48:20.5367488Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5367635Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5368002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5368192Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5368567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5368695Z _lazy_init(state, module) 2022-11-23T02:48:20.5369060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5369206Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5369547Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5369675Z return func(*args, **kwargs) 2022-11-23T02:48:20.5370045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5370153Z p_assert( 2022-11-23T02:48:20.5370494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5370624Z traceback.print_stack() 2022-11-23T02:48:20.5370750Z File "", line 1, in 2022-11-23T02:48:20.5370963Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5371109Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5371316Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5371453Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5371669Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5371776Z self.run() 2022-11-23T02:48:20.5371982Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5372136Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5372492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5372712Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5373112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5373241Z getattr(self, test_name)() 2022-11-23T02:48:20.5373613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5373719Z fn() 2022-11-23T02:48:20.5374090Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5374215Z test(self, **param_kwargs) 2022-11-23T02:48:20.5374581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5374714Z return func(*args, **kwargs) 2022-11-23T02:48:20.5374954Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5375076Z self.run_subtests( 2022-11-23T02:48:20.5375436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5375599Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5375979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5376194Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5376590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5376714Z output = model(*input) 2022-11-23T02:48:20.5377030Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5377173Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5377554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5377741Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5378119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5378244Z _lazy_init(state, module) 2022-11-23T02:48:20.5378604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5378753Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5379083Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5379212Z return func(*args, **kwargs) 2022-11-23T02:48:20.5379599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5379709Z p_assert( 2022-11-23T02:48:20.5380055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5380182Z traceback.print_stack() 2022-11-23T02:48:20.5380420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5380660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5380776Z File "", line 1, in 2022-11-23T02:48:20.5380993Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5381133Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5381340Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5381495Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5381712Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5381818Z self.run() 2022-11-23T02:48:20.5382025Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5382221Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5382578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5382716Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5383091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5383220Z getattr(self, test_name)() 
2022-11-23T02:48:20.5383589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5383690Z fn() 2022-11-23T02:48:20.5384065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5384173Z test(self, **param_kwargs) 2022-11-23T02:48:20.5384539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5384673Z return func(*args, **kwargs) 2022-11-23T02:48:20.5384935Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5385055Z self.run_subtests( 2022-11-23T02:48:20.5385471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5385648Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5386022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5386160Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5386542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5386667Z output = model(*input) 2022-11-23T02:48:20.5387002Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5387153Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5387540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5387723Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5388100Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5388208Z _lazy_init(state, module) 2022-11-23T02:48:20.5388574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5388716Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5389061Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5389193Z return func(*args, **kwargs) 2022-11-23T02:48:20.5389589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5389696Z p_assert( 2022-11-23T02:48:20.5390040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5390152Z traceback.print_stack() 2022-11-23T02:48:20.5390286Z File "", line 1, in 2022-11-23T02:48:20.5390499Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5390643Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5390848Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5391004Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5391221Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5391309Z self.run() 2022-11-23T02:48:20.5391517Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5391737Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5392094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5392229Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5392601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5392730Z getattr(self, test_name)() 
2022-11-23T02:48:20.5393098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5393182Z fn() 2022-11-23T02:48:20.5393551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5393678Z test(self, **param_kwargs) 2022-11-23T02:48:20.5394045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5394179Z return func(*args, **kwargs) 2022-11-23T02:48:20.5394440Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5394560Z self.run_subtests( 2022-11-23T02:48:20.5394981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5395374Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5395767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5395928Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5396315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5396441Z output = model(*input) 2022-11-23T02:48:20.5396779Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5396932Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5397320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5397484Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5397866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5397993Z _lazy_init(state, module) 2022-11-23T02:48:20.5398353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5398500Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5398850Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5398982Z return func(*args, **kwargs) 2022-11-23T02:48:20.5399379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5399468Z p_assert( 2022-11-23T02:48:20.5399814Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5399940Z traceback.print_stack() 2022-11-23T02:48:20.5400188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5400424Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5400661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5400897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5401134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5401345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5401676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5401908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5402142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5402375Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5402608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5402839Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5403070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5403280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5404055Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5404877Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5405649Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5406412Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.5407165Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5407902Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5408653Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5409407Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5409652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5409892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5410127Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5410422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5410657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5410887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5411126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5411340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5411569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5411797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5412026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5412256Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5412491Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5412716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5412937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5413199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5413443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5413670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5413898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5414122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5414351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5414587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5414813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5415045Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5415794Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5416544Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5417301Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5418055Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5418796Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.5419602Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5420345Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5421080Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5421330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5421567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5421804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5422091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5422335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5422564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5422796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5423028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5423240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5423475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5423588Z dist init r=1, world=2 2022-11-23T02:48:20.5423930Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5424255Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5424568Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5424877Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5425191Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5425499Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5425806Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5426095Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5426395Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5426694Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5427064Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5427371Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5427486Z dist init r=0, world=2 2022-11-23T02:48:20.5427814Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5428137Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5428448Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5428762Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5429114Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5429415Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5429723Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5430027Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5430339Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5430645Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5430954Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5431259Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5431359Z ok (5.413s) 2022-11-23T02:48:20.5431691Z test_nested_wrapped_model_offload_true_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90643 2022-11-23T02:48:20.5431920Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90644 2022-11-23T02:48:20.5432314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5432480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5432872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5433073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5433451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5433630Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5434017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5434291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5434539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5434770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5435369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5435791Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5436024Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5436251Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5436494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5436737Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5437905Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5438034Z warnings.warn( 2022-11-23T02:48:20.5439062Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.5439180Z warnings.warn( 2022-11-23T02:48:20.5439296Z File "", line 1, in 2022-11-23T02:48:20.5439516Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5439660Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5439872Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5440030Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5440250Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5440360Z self.run() 2022-11-23T02:48:20.5440566Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5440701Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5441057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5441197Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5441567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5441698Z getattr(self, test_name)() 2022-11-23T02:48:20.5442074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5442174Z fn() 2022-11-23T02:48:20.5442551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5442661Z test(self, **param_kwargs) 2022-11-23T02:48:20.5443029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5443160Z return func(*args, **kwargs) 2022-11-23T02:48:20.5443417Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5443620Z self.run_subtests( 2022-11-23T02:48:20.5443992Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5444159Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5444539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5444683Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5445071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5445197Z output = model(*input) 2022-11-23T02:48:20.5445530Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5445670Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5446049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5446236Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5446612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5446719Z _lazy_init(state, module) 2022-11-23T02:48:20.5447136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5447296Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5447649Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5447778Z return func(*args, **kwargs) 2022-11-23T02:48:20.5448166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5448270Z p_assert( 2022-11-23T02:48:20.5448610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5448726Z traceback.print_stack() 2022-11-23T02:48:20.5448857Z File "", line 1, in 2022-11-23T02:48:20.5449076Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5449223Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5449430Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5449578Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5449795Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5449882Z self.run() 2022-11-23T02:48:20.5450085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5450231Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5450582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5450726Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5451098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5451225Z getattr(self, test_name)() 2022-11-23T02:48:20.5451598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5451683Z fn() 2022-11-23T02:48:20.5452057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5452175Z test(self, **param_kwargs) 2022-11-23T02:48:20.5452541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5452667Z return func(*args, **kwargs) 2022-11-23T02:48:20.5452930Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5453113Z self.run_subtests( 2022-11-23T02:48:20.5453483Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5453631Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5454008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5454158Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5454542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5454664Z output = model(*input) 2022-11-23T02:48:20.5454997Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5455140Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5455523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5455690Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5456062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5456182Z _lazy_init(state, module) 2022-11-23T02:48:20.5456601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5456760Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5457111Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5457242Z return func(*args, **kwargs) 2022-11-23T02:48:20.5457626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5457717Z p_assert( 2022-11-23T02:48:20.5458066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5458190Z traceback.print_stack() 2022-11-23T02:48:20.5458427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5458669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5458806Z File "", line 1, in 2022-11-23T02:48:20.5459018Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5459157Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5459344Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5459497Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5459711Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5459820Z self.run() 2022-11-23T02:48:20.5460031Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5460175Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5460522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5460642Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5461020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5461144Z getattr(self, test_name)() 2022-11-23T02:48:20.5461514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5461614Z fn() 2022-11-23T02:48:20.5461989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5462112Z test(self, **param_kwargs) 2022-11-23T02:48:20.5462474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5462653Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5462911Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5463029Z self.run_subtests( 2022-11-23T02:48:20.5463399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5463559Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5463935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5464093Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5464475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5464580Z output = model(*input) 2022-11-23T02:48:20.5464916Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5465059Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5465446Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5465674Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5466059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5466187Z _lazy_init(state, module) 2022-11-23T02:48:20.5466549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5466678Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5467024Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5467158Z return func(*args, **kwargs) 2022-11-23T02:48:20.5467550Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5467654Z p_assert( 2022-11-23T02:48:20.5468003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5468136Z traceback.print_stack() 2022-11-23T02:48:20.5468267Z File "", line 1, in 2022-11-23T02:48:20.5468464Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5468605Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5468810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5468957Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5469166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5469275Z self.run() 2022-11-23T02:48:20.5469482Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5469626Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5469957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5470094Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5470468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5470593Z getattr(self, test_name)() 2022-11-23T02:48:20.5470959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5471065Z fn() 2022-11-23T02:48:20.5471438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5471548Z test(self, **param_kwargs) 2022-11-23T02:48:20.5471916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5472113Z return func(*args, **kwargs) 2022-11-23T02:48:20.5472379Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5472491Z self.run_subtests( 2022-11-23T02:48:20.5472863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5473027Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5473442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5473583Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5473965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5474084Z output = model(*input) 2022-11-23T02:48:20.5474423Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5474564Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5474950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5475396Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5475798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5475920Z _lazy_init(state, module) 2022-11-23T02:48:20.5476264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5476407Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5476749Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5476881Z return func(*args, **kwargs) 2022-11-23T02:48:20.5477273Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5477380Z p_assert( 2022-11-23T02:48:20.5477725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5477842Z traceback.print_stack() 2022-11-23T02:48:20.5478088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5478326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5478457Z File "", line 1, in 2022-11-23T02:48:20.5478676Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5478822Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5479025Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5479182Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5479381Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5479486Z self.run() 2022-11-23T02:48:20.5479691Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5479842Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5480200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5480340Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5480710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5480835Z getattr(self, test_name)() 2022-11-23T02:48:20.5481184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5481285Z fn() 2022-11-23T02:48:20.5481745Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5481874Z test(self, **param_kwargs) 2022-11-23T02:48:20.5482235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5482364Z return func(*args, **kwargs) 2022-11-23T02:48:20.5482625Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5482744Z self.run_subtests( 2022-11-23T02:48:20.5483085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5483249Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5483623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5483784Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5484170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5484296Z output = model(*input) 2022-11-23T02:48:20.5484626Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5484822Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5485207Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5485383Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5485756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5485878Z _lazy_init(state, module) 2022-11-23T02:48:20.5486241Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5486395Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5486744Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5486868Z return func(*args, **kwargs) 2022-11-23T02:48:20.5487240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5487345Z p_assert( 2022-11-23T02:48:20.5487690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5487818Z traceback.print_stack() 2022-11-23T02:48:20.5487952Z File "", line 1, in 2022-11-23T02:48:20.5488164Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5488311Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5488502Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5488654Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5488875Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5488986Z self.run() 2022-11-23T02:48:20.5489188Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5489334Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5489682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5489816Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5490169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5490293Z getattr(self, test_name)() 2022-11-23T02:48:20.5490659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5490827Z fn() 2022-11-23T02:48:20.5491206Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5491331Z test(self, **param_kwargs) 2022-11-23T02:48:20.5491688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5491816Z return func(*args, **kwargs) 2022-11-23T02:48:20.5492058Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5492172Z self.run_subtests( 2022-11-23T02:48:20.5492529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5492688Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5493062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5493228Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5493616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5493737Z output = model(*input) 2022-11-23T02:48:20.5494056Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5494252Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5494650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5494829Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5495202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5495322Z _lazy_init(state, module) 2022-11-23T02:48:20.5495679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5495829Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5496159Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5496289Z return func(*args, **kwargs) 2022-11-23T02:48:20.5496680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5496788Z p_assert( 2022-11-23T02:48:20.5497126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5497257Z traceback.print_stack() 2022-11-23T02:48:20.5497500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5497742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5497858Z File "", line 1, in 2022-11-23T02:48:20.5498075Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5498223Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5498429Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5498583Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5498804Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5498909Z self.run() 2022-11-23T02:48:20.5499097Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5499242Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5499593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5499728Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5500095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5500301Z getattr(self, test_name)() 
2022-11-23T02:48:20.5500675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5500775Z fn() 2022-11-23T02:48:20.5501131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5501259Z test(self, **param_kwargs) 2022-11-23T02:48:20.5501625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5501752Z return func(*args, **kwargs) 2022-11-23T02:48:20.5502010Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5502125Z self.run_subtests( 2022-11-23T02:48:20.5502485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5502656Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5503013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5503172Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5503604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5503736Z output = model(*input) 2022-11-23T02:48:20.5504078Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5504227Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5504616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5504795Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5505153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5505285Z _lazy_init(state, module) 2022-11-23T02:48:20.5505650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5505802Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5506157Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5506290Z return func(*args, **kwargs) 2022-11-23T02:48:20.5506686Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5506793Z p_assert( 2022-11-23T02:48:20.5507119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5507247Z traceback.print_stack() 2022-11-23T02:48:20.5507373Z File "", line 1, in 2022-11-23T02:48:20.5507586Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5507732Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5507940Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5508093Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5508316Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5508404Z self.run() 2022-11-23T02:48:20.5508614Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5508766Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5509117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5509256Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5509628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5509828Z getattr(self, test_name)() 
2022-11-23T02:48:20.5510201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5510284Z fn() 2022-11-23T02:48:20.5510658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5510786Z test(self, **param_kwargs) 2022-11-23T02:48:20.5511156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5511285Z return func(*args, **kwargs) 2022-11-23T02:48:20.5511545Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5511659Z self.run_subtests( 2022-11-23T02:48:20.5512005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5512176Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5512544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5512706Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5513147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5513281Z output = model(*input) 2022-11-23T02:48:20.5513621Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5513766Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5514131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5514314Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5514695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5514822Z _lazy_init(state, module) 2022-11-23T02:48:20.5515417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5515571Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5515934Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5516061Z return func(*args, **kwargs) 2022-11-23T02:48:20.5516451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5516538Z p_assert( 2022-11-23T02:48:20.5516886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5517019Z traceback.print_stack() 2022-11-23T02:48:20.5517264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5517514Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5517651Z File "", line 1, in 2022-11-23T02:48:20.5517869Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5518002Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5518209Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5518366Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5518583Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5518687Z self.run() 2022-11-23T02:48:20.5518893Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5519039Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5519391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5519606Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5519982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5520111Z getattr(self, test_name)() 2022-11-23T02:48:20.5520485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5520586Z fn() 2022-11-23T02:48:20.5520963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5521088Z test(self, **param_kwargs) 2022-11-23T02:48:20.5521459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5521572Z return func(*args, **kwargs) 2022-11-23T02:48:20.5521833Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5521947Z self.run_subtests( 2022-11-23T02:48:20.5522309Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5522477Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5522918Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5523086Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5523473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5523578Z output = model(*input) 2022-11-23T02:48:20.5523909Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5524056Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5524449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5524631Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5525009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5525139Z _lazy_init(state, module) 2022-11-23T02:48:20.5525495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5525625Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5525969Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5526097Z return func(*args, **kwargs) 2022-11-23T02:48:20.5526484Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5526591Z p_assert( 2022-11-23T02:48:20.5526942Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5527065Z traceback.print_stack() 2022-11-23T02:48:20.5527195Z File "", line 1, in 2022-11-23T02:48:20.5527391Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5527536Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5527744Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5527900Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5528120Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5528228Z self.run() 2022-11-23T02:48:20.5528438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5528572Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5528984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5529121Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5529495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5529624Z getattr(self, test_name)() 2022-11-23T02:48:20.5529994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5530090Z fn() 2022-11-23T02:48:20.5530472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5530583Z test(self, **param_kwargs) 2022-11-23T02:48:20.5530949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5531074Z return func(*args, **kwargs) 2022-11-23T02:48:20.5531332Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5531454Z self.run_subtests( 2022-11-23T02:48:20.5531819Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5531988Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5532412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5532564Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5532952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5533070Z output = model(*input) 2022-11-23T02:48:20.5533402Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5533546Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5533939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5534126Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5534503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5534616Z _lazy_init(state, module) 2022-11-23T02:48:20.5534979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5535122Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5535464Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5535591Z return func(*args, **kwargs) 2022-11-23T02:48:20.5535971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5536076Z p_assert( 2022-11-23T02:48:20.5536420Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5536533Z traceback.print_stack() 2022-11-23T02:48:20.5536779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5537027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5537159Z File "", line 1, in 2022-11-23T02:48:20.5537378Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5537524Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5537729Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5537883Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5538084Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5538261Z self.run() 2022-11-23T02:48:20.5538465Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5538616Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5538972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5539111Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5539487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5539597Z getattr(self, test_name)() 2022-11-23T02:48:20.5539971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5540069Z fn() 2022-11-23T02:48:20.5540443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5540565Z test(self, **param_kwargs) 2022-11-23T02:48:20.5540934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5541061Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5541320Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5541419Z self.run_subtests( 2022-11-23T02:48:20.5541831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5542011Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5542383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5542537Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5542920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5543043Z output = model(*input) 2022-11-23T02:48:20.5543381Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5543507Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5543888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5544074Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5544448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5544573Z _lazy_init(state, module) 2022-11-23T02:48:20.5544936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5545081Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5545427Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5545544Z return func(*args, **kwargs) 2022-11-23T02:48:20.5545934Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5546042Z p_assert( 2022-11-23T02:48:20.5546392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5546528Z traceback.print_stack() 2022-11-23T02:48:20.5546661Z File "", line 1, in 2022-11-23T02:48:20.5546875Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5547014Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5547202Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5547359Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5547576Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5547745Z self.run() 2022-11-23T02:48:20.5547955Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5548098Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5548443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5548581Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5548936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5549063Z getattr(self, test_name)() 2022-11-23T02:48:20.5549431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5549534Z fn() 2022-11-23T02:48:20.5549909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5550033Z test(self, **param_kwargs) 2022-11-23T02:48:20.5550404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5550517Z return func(*args, **kwargs) 2022-11-23T02:48:20.5550770Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5550887Z self.run_subtests( 2022-11-23T02:48:20.5551296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5551474Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5551852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5552010Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5552389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5552518Z output = model(*input) 2022-11-23T02:48:20.5552833Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5552972Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5553359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5553544Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5553918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5554042Z _lazy_init(state, module) 2022-11-23T02:48:20.5554400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5554548Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5554876Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5555007Z return func(*args, **kwargs) 2022-11-23T02:48:20.5555572Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5555677Z p_assert( 2022-11-23T02:48:20.5556027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5556158Z traceback.print_stack() 2022-11-23T02:48:20.5556399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5556645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5556760Z File "", line 1, in 2022-11-23T02:48:20.5556973Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5557115Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5557317Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5557563Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5557778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5557886Z self.run() 2022-11-23T02:48:20.5558075Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5558230Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5558582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5558720Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5559094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5559222Z getattr(self, test_name)() 2022-11-23T02:48:20.5559594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5559701Z fn() 2022-11-23T02:48:20.5560058Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5560184Z test(self, **param_kwargs) 2022-11-23T02:48:20.5560547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5560739Z return func(*args, **kwargs) 2022-11-23T02:48:20.5561008Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5561121Z self.run_subtests( 2022-11-23T02:48:20.5561483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5561653Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5562008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5562175Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5562559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5562684Z output = model(*input) 2022-11-23T02:48:20.5563024Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5563171Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5563558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5563737Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5564098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5564225Z _lazy_init(state, module) 2022-11-23T02:48:20.5564584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5564733Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5565077Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5565207Z return func(*args, **kwargs) 2022-11-23T02:48:20.5565604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5565710Z p_assert( 2022-11-23T02:48:20.5566038Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5566167Z traceback.print_stack() 2022-11-23T02:48:20.5566295Z File "", line 1, in 2022-11-23T02:48:20.5566509Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5566652Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5566947Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5567100Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5567299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5567397Z self.run() 2022-11-23T02:48:20.5567601Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5567760Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5568113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5568252Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5568621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5568749Z getattr(self, test_name)() 2022-11-23T02:48:20.5569099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5569207Z fn() 2022-11-23T02:48:20.5569579Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5569704Z test(self, **param_kwargs) 2022-11-23T02:48:20.5570073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5570257Z return func(*args, **kwargs) 2022-11-23T02:48:20.5570528Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5570640Z self.run_subtests( 2022-11-23T02:48:20.5570988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5571153Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5571527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5571691Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5572075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5572200Z output = model(*input) 2022-11-23T02:48:20.5572540Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5572689Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5573056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5573273Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5573654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5573777Z _lazy_init(state, module) 2022-11-23T02:48:20.5574142Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5574289Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5574633Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5574763Z return func(*args, **kwargs) 2022-11-23T02:48:20.5575138Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5575247Z p_assert( 2022-11-23T02:48:20.5575596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5575728Z traceback.print_stack() 2022-11-23T02:48:20.5575969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5576213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5576409Z File "", line 1, in 2022-11-23T02:48:20.5576625Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5576752Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5576955Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5577108Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5577330Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5577433Z self.run() 2022-11-23T02:48:20.5577638Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5577789Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5578128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5578262Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5578626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5578758Z getattr(self, test_name)() 
2022-11-23T02:48:20.5579128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5579227Z fn() 2022-11-23T02:48:20.5579656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5579794Z test(self, **param_kwargs) 2022-11-23T02:48:20.5580147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5580272Z return func(*args, **kwargs) 2022-11-23T02:48:20.5580525Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5580635Z self.run_subtests( 2022-11-23T02:48:20.5580995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5581167Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5581545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5581701Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5582074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5582202Z output = model(*input) 2022-11-23T02:48:20.5582535Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5582673Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5583054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5583237Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5583614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5583742Z _lazy_init(state, module) 2022-11-23T02:48:20.5584105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5584235Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5584582Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5584710Z return func(*args, **kwargs) 2022-11-23T02:48:20.5585094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5585200Z p_assert( 2022-11-23T02:48:20.5585547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5585683Z traceback.print_stack() 2022-11-23T02:48:20.5585917Z File "", line 1, in 2022-11-23T02:48:20.5586134Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5586273Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5586477Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5586632Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5586854Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5586963Z self.run() 2022-11-23T02:48:20.5587165Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5587297Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5587654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5587792Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5588162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5588296Z getattr(self, test_name)() 
2022-11-23T02:48:20.5588667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5588767Z fn() 2022-11-23T02:48:20.5589195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5589316Z test(self, **param_kwargs) 2022-11-23T02:48:20.5589683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5589810Z return func(*args, **kwargs) 2022-11-23T02:48:20.5590068Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5590183Z self.run_subtests( 2022-11-23T02:48:20.5590544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5590718Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5591095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5591236Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5591623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5591751Z output = model(*input) 2022-11-23T02:48:20.5592088Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5592235Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5592620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5592799Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5593180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5593290Z _lazy_init(state, module) 2022-11-23T02:48:20.5593649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5593799Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5594141Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5594269Z return func(*args, **kwargs) 2022-11-23T02:48:20.5594659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5594762Z p_assert( 2022-11-23T02:48:20.5595367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5595490Z traceback.print_stack() 2022-11-23T02:48:20.5595834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5596077Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5596208Z File "", line 1, in 2022-11-23T02:48:20.5596421Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5596571Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5596780Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5596917Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5597132Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5597238Z self.run() 2022-11-23T02:48:20.5597444Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5597595Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5597961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5598098Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5598466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5598576Z getattr(self, test_name)() 2022-11-23T02:48:20.5599003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5599116Z fn() 2022-11-23T02:48:20.5599499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5599621Z test(self, **param_kwargs) 2022-11-23T02:48:20.5599985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5600117Z return func(*args, **kwargs) 2022-11-23T02:48:20.5600384Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5600485Z self.run_subtests( 2022-11-23T02:48:20.5600850Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5601017Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5601395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5601552Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5601939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5602066Z output = model(*input) 2022-11-23T02:48:20.5602404Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5602533Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5602928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5603110Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5603490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5603624Z _lazy_init(state, module) 2022-11-23T02:48:20.5603985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5604132Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5604472Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5604584Z return func(*args, **kwargs) 2022-11-23T02:48:20.5604972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5605139Z p_assert( 2022-11-23T02:48:20.5605487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5605620Z traceback.print_stack() 2022-11-23T02:48:20.5605754Z File "", line 1, in 2022-11-23T02:48:20.5605971Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5606116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5606303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5606502Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5606722Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5606832Z self.run() 2022-11-23T02:48:20.5607038Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5607191Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5607543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5607663Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5608030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5608159Z getattr(self, test_name)() 2022-11-23T02:48:20.5608589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5608700Z fn() 2022-11-23T02:48:20.5609080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5609209Z test(self, **param_kwargs) 2022-11-23T02:48:20.5609569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5609680Z return func(*args, **kwargs) 2022-11-23T02:48:20.5609939Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5610058Z self.run_subtests( 2022-11-23T02:48:20.5610423Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5610595Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5610971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5611133Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5611524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5611632Z output = model(*input) 2022-11-23T02:48:20.5611968Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5612117Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5612507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5612689Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5613071Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5613200Z _lazy_init(state, module) 2022-11-23T02:48:20.5613555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5613684Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5614029Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5614159Z return func(*args, **kwargs) 2022-11-23T02:48:20.5614544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5614713Z p_assert( 2022-11-23T02:48:20.5615065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5615186Z traceback.print_stack() 2022-11-23T02:48:20.5615419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5615647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5615779Z File "", line 1, in 2022-11-23T02:48:20.5615992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5616137Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5616339Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5616490Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5616709Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5616821Z self.run() 2022-11-23T02:48:20.5617009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5617154Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5617507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5617691Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5618075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5618198Z getattr(self, test_name)() 2022-11-23T02:48:20.5618569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5618654Z fn() 2022-11-23T02:48:20.5619023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5619155Z test(self, **param_kwargs) 2022-11-23T02:48:20.5619520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5619649Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5619909Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5620033Z self.run_subtests( 2022-11-23T02:48:20.5620387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5620537Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5620906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5621066Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5621455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5621581Z output = model(*input) 2022-11-23T02:48:20.5621918Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5622066Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5622459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5622643Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5623000Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5623128Z _lazy_init(state, module) 2022-11-23T02:48:20.5623489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5623636Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5623981Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5624171Z return func(*args, **kwargs) 2022-11-23T02:48:20.5624568Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5624670Z p_assert( 2022-11-23T02:48:20.5624996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5625126Z traceback.print_stack() 2022-11-23T02:48:20.5625257Z File "", line 1, in 2022-11-23T02:48:20.5625469Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5625614Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5625821Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5625974Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5626174Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5626285Z self.run() 2022-11-23T02:48:20.5626492Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5626643Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5626989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5627175Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5627560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5627688Z getattr(self, test_name)() 2022-11-23T02:48:20.5628038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5628143Z fn() 2022-11-23T02:48:20.5628513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5628645Z test(self, **param_kwargs) 2022-11-23T02:48:20.5629015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5629142Z return func(*args, **kwargs) 2022-11-23T02:48:20.5629398Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5629519Z self.run_subtests( 2022-11-23T02:48:20.5629864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5630035Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5630409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5630567Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5630958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5631084Z output = model(*input) 2022-11-23T02:48:20.5631415Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5631560Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5631929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5632113Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5632485Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5632607Z _lazy_init(state, module) 2022-11-23T02:48:20.5632965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5633112Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5633539Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5633671Z return func(*args, **kwargs) 2022-11-23T02:48:20.5634043Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5634152Z p_assert( 2022-11-23T02:48:20.5634502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5634632Z traceback.print_stack() 2022-11-23T02:48:20.5634877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5635350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5635495Z File "", line 1, in 2022-11-23T02:48:20.5635712Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5635842Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5636051Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5636204Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5636420Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5636524Z self.run() 2022-11-23T02:48:20.5636811Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5636976Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5637321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5637464Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5637836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5637964Z getattr(self, test_name)() 2022-11-23T02:48:20.5638331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5638438Z fn() 2022-11-23T02:48:20.5638811Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5638939Z test(self, **param_kwargs) 2022-11-23T02:48:20.5639290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5639420Z return func(*args, **kwargs) 2022-11-23T02:48:20.5639676Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5639794Z self.run_subtests( 2022-11-23T02:48:20.5640151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5640320Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5640698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5640862Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5641232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5641358Z output = model(*input) 2022-11-23T02:48:20.5641692Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5641838Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5642221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5642402Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5642773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5642897Z _lazy_init(state, module) 2022-11-23T02:48:20.5643328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5643474Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5643819Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5643948Z return func(*args, **kwargs) 2022-11-23T02:48:20.5644338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5644451Z p_assert( 2022-11-23T02:48:20.5644794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5644925Z traceback.print_stack() 2022-11-23T02:48:20.5645040Z File "", line 1, in 2022-11-23T02:48:20.5645256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5645412Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5645622Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5645776Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5645994Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5646101Z self.run() 2022-11-23T02:48:20.5646342Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5646505Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5646853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5646988Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5647356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5647486Z getattr(self, test_name)() 2022-11-23T02:48:20.5647847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5647956Z fn() 2022-11-23T02:48:20.5648310Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5648433Z test(self, **param_kwargs) 2022-11-23T02:48:20.5648805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5648935Z return func(*args, **kwargs) 2022-11-23T02:48:20.5649190Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5649304Z self.run_subtests( 2022-11-23T02:48:20.5649668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5649835Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5650191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5650355Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5650744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5650863Z output = model(*input) 2022-11-23T02:48:20.5651197Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5651349Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5651731Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5651913Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5652274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5652458Z _lazy_init(state, module) 2022-11-23T02:48:20.5652827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5652969Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5653316Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5653444Z return func(*args, **kwargs) 2022-11-23T02:48:20.5653833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5653936Z p_assert( 2022-11-23T02:48:20.5654264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5654397Z traceback.print_stack() 2022-11-23T02:48:20.5654640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5654884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5655020Z File "", line 1, in 2022-11-23T02:48:20.5655234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5655380Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5655587Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5655773Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5656003Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5656111Z self.run() 2022-11-23T02:48:20.5656316Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5656466Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5656811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5656948Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5657323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5657432Z getattr(self, test_name)() 
2022-11-23T02:48:20.5657800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5657902Z fn() 2022-11-23T02:48:20.5658279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5658405Z test(self, **param_kwargs) 2022-11-23T02:48:20.5658772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5658903Z return func(*args, **kwargs) 2022-11-23T02:48:20.5659142Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5659256Z self.run_subtests( 2022-11-23T02:48:20.5659623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5659788Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5660154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5660315Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5660706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5660832Z output = model(*input) 2022-11-23T02:48:20.5661170Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5661299Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5661680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5661931Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5662306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5662430Z _lazy_init(state, module) 2022-11-23T02:48:20.5662792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5662946Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5663291Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5663402Z return func(*args, **kwargs) 2022-11-23T02:48:20.5663795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5663899Z p_assert( 2022-11-23T02:48:20.5664246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5664382Z traceback.print_stack() 2022-11-23T02:48:20.5664515Z File "", line 1, in 2022-11-23T02:48:20.5664729Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5664857Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5665058Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5665261Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5665491Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5665595Z self.run() 2022-11-23T02:48:20.5665801Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5665954Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5666303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5666423Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5666798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5666924Z getattr(self, test_name)() 
2022-11-23T02:48:20.5667290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5667393Z fn() 2022-11-23T02:48:20.5667772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5667899Z test(self, **param_kwargs) 2022-11-23T02:48:20.5668264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5668375Z return func(*args, **kwargs) 2022-11-23T02:48:20.5668633Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5668752Z self.run_subtests( 2022-11-23T02:48:20.5669113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5669278Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5669653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5669808Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5670189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5670294Z output = model(*input) 2022-11-23T02:48:20.5670625Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5670773Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5671155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5671407Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5671786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5671908Z _lazy_init(state, module) 2022-11-23T02:48:20.5672271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5672401Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5672751Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5672881Z return func(*args, **kwargs) 2022-11-23T02:48:20.5673311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5673421Z p_assert( 2022-11-23T02:48:20.5673770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5673905Z traceback.print_stack() 2022-11-23T02:48:20.5674151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5674377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5674621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5674914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5675337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5675579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5675809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5676040Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5676278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5676492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5676721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5676955Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5677189Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5677421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5677652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5677885Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5678113Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5678332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5678561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5678789Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5679026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5679254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5679481Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5679712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5679942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5680169Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5680469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5680702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5680931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5681167Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5681396Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5681628Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5681856Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5682082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5682293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5682522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5682748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5682972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5683266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5683516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5683746Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5683976Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5684206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5684425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5684652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5684882Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5685118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5685346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5685464Z dist init r=1, world=2 2022-11-23T02:48:20.5685807Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5686130Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5686434Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5686749Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5687058Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5687368Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5687672Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.5687978Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5688340Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5688648Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5688981Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5689302Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5689415Z dist init r=0, world=2 2022-11-23T02:48:20.5689722Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5690015Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5690363Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5690679Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5690983Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.5691288Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5691598Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5691909Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5692219Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5692524Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5692832Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5693139Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5693228Z ok (5.713s) 2022-11-23T02:48:20.5693578Z test_nested_wrapped_model_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90726 2022-11-23T02:48:20.5693809Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90727 2022-11-23T02:48:20.5694209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5694389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5694772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5694966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5695417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5695598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5695967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5696170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5696423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5696672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5697083Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5697490Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5697732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5697965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5698195Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5698464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5699509Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5699623Z warnings.warn( 2022-11-23T02:48:20.5700653Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.5700770Z warnings.warn( 2022-11-23T02:48:20.5700905Z File "", line 1, in 2022-11-23T02:48:20.5701124Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5701275Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5701485Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5701639Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5701840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5701952Z self.run() 2022-11-23T02:48:20.5702158Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5702310Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5702662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5702804Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5703177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5703305Z getattr(self, test_name)() 2022-11-23T02:48:20.5703653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5703754Z fn() 2022-11-23T02:48:20.5704123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5704317Z test(self, **param_kwargs) 2022-11-23T02:48:20.5704685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5704816Z return func(*args, **kwargs) 2022-11-23T02:48:20.5705079Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5705199Z self.run_subtests( 2022-11-23T02:48:20.5705541Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5705709Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5706083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5706245Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5706633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5706762Z output = model(*input) 2022-11-23T02:48:20.5707099Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5707246Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5707664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5707857Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5708238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5708367Z _lazy_init(state, module) 2022-11-23T02:48:20.5708729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5708878Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5709225Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5709359Z return func(*args, **kwargs) 2022-11-23T02:48:20.5709731Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5709840Z p_assert( 2022-11-23T02:48:20.5710186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5710315Z traceback.print_stack() 2022-11-23T02:48:20.5710446Z File "", line 1, in 2022-11-23T02:48:20.5710661Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5710806Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5711008Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5711146Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5711361Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5711472Z self.run() 2022-11-23T02:48:20.5711683Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5711834Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5712188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5712330Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5712685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5712816Z getattr(self, test_name)() 2022-11-23T02:48:20.5713181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5713282Z fn() 2022-11-23T02:48:20.5713652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5713849Z test(self, **param_kwargs) 2022-11-23T02:48:20.5714219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5714345Z return func(*args, **kwargs) 2022-11-23T02:48:20.5714584Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5714705Z self.run_subtests( 2022-11-23T02:48:20.5715326Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5715511Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5715895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5716055Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5716436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5716563Z output = model(*input) 2022-11-23T02:48:20.5716876Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5717019Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5717482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5717677Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5718059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5718186Z _lazy_init(state, module) 2022-11-23T02:48:20.5718543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5718693Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5719027Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5719158Z return func(*args, **kwargs) 2022-11-23T02:48:20.5719550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5719654Z p_assert( 2022-11-23T02:48:20.5720004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5720138Z traceback.print_stack() 2022-11-23T02:48:20.5720381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5720623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5720740Z File "", line 1, in 2022-11-23T02:48:20.5720957Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5721102Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5721315Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5721464Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5721683Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5721792Z self.run() 2022-11-23T02:48:20.5722004Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5722137Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5722489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5722626Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5722997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5723125Z getattr(self, test_name)() 2022-11-23T02:48:20.5723487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5723672Z fn() 2022-11-23T02:48:20.5724036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5724164Z test(self, **param_kwargs) 2022-11-23T02:48:20.5724534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5724667Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5724927Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5725045Z self.run_subtests( 2022-11-23T02:48:20.5725411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5725578Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5725932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5726098Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5726478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5726603Z output = model(*input) 2022-11-23T02:48:20.5727044Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5727207Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5727593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5727776Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5728149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5728257Z _lazy_init(state, module) 2022-11-23T02:48:20.5728622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5728768Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5729115Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5729243Z return func(*args, **kwargs) 2022-11-23T02:48:20.5729635Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5729742Z p_assert( 2022-11-23T02:48:20.5730069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5730199Z traceback.print_stack() 2022-11-23T02:48:20.5730329Z File "", line 1, in 2022-11-23T02:48:20.5730539Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5730694Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5730902Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5731056Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5731273Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5731362Z self.run() 2022-11-23T02:48:20.5731573Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5731723Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5732069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5732207Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5732576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5732703Z getattr(self, test_name)() 2022-11-23T02:48:20.5733069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5733215Z fn() 2022-11-23T02:48:20.5733593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5733722Z test(self, **param_kwargs) 2022-11-23T02:48:20.5734087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5734214Z return func(*args, **kwargs) 2022-11-23T02:48:20.5734473Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5734587Z self.run_subtests( 2022-11-23T02:48:20.5734931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5735101Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5735476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5735641Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5736024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5736147Z output = model(*input) 2022-11-23T02:48:20.5736533Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5736690Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5737079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5737242Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5737617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5737745Z _lazy_init(state, module) 2022-11-23T02:48:20.5738106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5738252Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5738601Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5738735Z return func(*args, **kwargs) 2022-11-23T02:48:20.5739124Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5739214Z p_assert( 2022-11-23T02:48:20.5739560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5739694Z traceback.print_stack() 2022-11-23T02:48:20.5739938Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5740178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5740316Z File "", line 1, in 2022-11-23T02:48:20.5740525Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5740654Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5740864Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5741022Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5741242Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5741352Z self.run() 2022-11-23T02:48:20.5741555Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5741706Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5742060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5742181Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5742621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5742749Z getattr(self, test_name)() 2022-11-23T02:48:20.5743120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5743224Z fn() 2022-11-23T02:48:20.5743598Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5743726Z test(self, **param_kwargs) 2022-11-23T02:48:20.5744089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5744201Z return func(*args, **kwargs) 2022-11-23T02:48:20.5744459Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5744577Z self.run_subtests( 2022-11-23T02:48:20.5744939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5745105Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5745477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5745635Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5746070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5746185Z output = model(*input) 2022-11-23T02:48:20.5746523Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5746669Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5747050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5747235Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5747609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5747734Z _lazy_init(state, module) 2022-11-23T02:48:20.5748097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5748233Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5748583Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5748712Z return func(*args, **kwargs) 2022-11-23T02:48:20.5749104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5749213Z p_assert( 2022-11-23T02:48:20.5749559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5749696Z traceback.print_stack() 2022-11-23T02:48:20.5749827Z File "", line 1, in 2022-11-23T02:48:20.5750024Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5750170Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5750380Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5750537Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5750753Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5750860Z self.run() 2022-11-23T02:48:20.5751066Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5751198Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5751549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5751682Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5752125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5752250Z getattr(self, test_name)() 2022-11-23T02:48:20.5752617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5752719Z fn() 2022-11-23T02:48:20.5753095Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5753204Z test(self, **param_kwargs) 2022-11-23T02:48:20.5753569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5753701Z return func(*args, **kwargs) 2022-11-23T02:48:20.5753956Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5754074Z self.run_subtests( 2022-11-23T02:48:20.5754442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5754613Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5754986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5755378Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5755794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5755921Z output = model(*input) 2022-11-23T02:48:20.5756256Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5756404Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5756792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5756982Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5757362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5757470Z _lazy_init(state, module) 2022-11-23T02:48:20.5757841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5757989Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5758332Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5758461Z return func(*args, **kwargs) 2022-11-23T02:48:20.5758851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5758962Z p_assert( 2022-11-23T02:48:20.5759311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5759429Z traceback.print_stack() 2022-11-23T02:48:20.5759673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5759914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5760049Z File "", line 1, in 2022-11-23T02:48:20.5760268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5760415Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5760621Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5760778Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5760979Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5761087Z self.run() 2022-11-23T02:48:20.5761295Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5761545Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5761901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5762039Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5762405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5762539Z getattr(self, test_name)() 
2022-11-23T02:48:20.5762892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5762997Z fn() 2022-11-23T02:48:20.5763367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5763491Z test(self, **param_kwargs) 2022-11-23T02:48:20.5763852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5763985Z return func(*args, **kwargs) 2022-11-23T02:48:20.5764247Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5764347Z self.run_subtests( 2022-11-23T02:48:20.5764711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5764942Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5765327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5765490Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5765873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5765997Z output = model(*input) 2022-11-23T02:48:20.5766333Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5766470Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5766861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5767046Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5767432Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5767557Z _lazy_init(state, module) 2022-11-23T02:48:20.5767920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5768068Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5768415Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5768547Z return func(*args, **kwargs) 2022-11-23T02:48:20.5768916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5769026Z p_assert( 2022-11-23T02:48:20.5769377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5769510Z traceback.print_stack() 2022-11-23T02:48:20.5769644Z File "", line 1, in 2022-11-23T02:48:20.5769863Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5770013Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5770204Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5770358Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5770576Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5770687Z self.run() 2022-11-23T02:48:20.5770896Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5771110Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5771464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5771604Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5771958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5772091Z getattr(self, test_name)() 
2022-11-23T02:48:20.5772460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5772565Z fn() 2022-11-23T02:48:20.5772938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5773063Z test(self, **param_kwargs) 2022-11-23T02:48:20.5773472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5773608Z return func(*args, **kwargs) 2022-11-23T02:48:20.5773849Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5773964Z self.run_subtests( 2022-11-23T02:48:20.5774326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5774549Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5774935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5775095Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5775482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5775610Z output = model(*input) 2022-11-23T02:48:20.5775925Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5776075Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5776465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5776645Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5777025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5777150Z _lazy_init(state, module) 2022-11-23T02:48:20.5777510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5777660Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5777987Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5778119Z return func(*args, **kwargs) 2022-11-23T02:48:20.5778508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5778622Z p_assert( 2022-11-23T02:48:20.5778970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5779102Z traceback.print_stack() 2022-11-23T02:48:20.5779348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5779587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5779701Z File "", line 1, in 2022-11-23T02:48:20.5779913Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5780059Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5780265Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5780419Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5780706Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5780817Z self.run() 2022-11-23T02:48:20.5781006Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5781156Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5781514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5781651Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5782020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5782150Z getattr(self, test_name)() 2022-11-23T02:48:20.5782519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5782624Z fn() 2022-11-23T02:48:20.5782978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5783111Z test(self, **param_kwargs) 2022-11-23T02:48:20.5783486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5783619Z return func(*args, **kwargs) 2022-11-23T02:48:20.5783928Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5784057Z self.run_subtests( 2022-11-23T02:48:20.5784421Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5784591Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5784944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5785104Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5785492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5785623Z output = model(*input) 2022-11-23T02:48:20.5785958Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5786107Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5786497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5786683Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5787042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5787171Z _lazy_init(state, module) 2022-11-23T02:48:20.5787535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5787682Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5788032Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5788164Z return func(*args, **kwargs) 2022-11-23T02:48:20.5788550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5788659Z p_assert( 2022-11-23T02:48:20.5788993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5789124Z traceback.print_stack() 2022-11-23T02:48:20.5789257Z File "", line 1, in 2022-11-23T02:48:20.5789472Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5789619Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5789827Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5789984Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5790269Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5790360Z self.run() 2022-11-23T02:48:20.5790570Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5790720Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5791074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5791214Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5791588Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5791719Z getattr(self, test_name)() 2022-11-23T02:48:20.5792068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5792170Z fn() 2022-11-23T02:48:20.5792544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5792675Z test(self, **param_kwargs) 2022-11-23T02:48:20.5793040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5793169Z return func(*args, **kwargs) 2022-11-23T02:48:20.5793477Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5793607Z self.run_subtests( 2022-11-23T02:48:20.5793951Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5794120Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5794491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5794650Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5795249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5795392Z output = model(*input) 2022-11-23T02:48:20.5795732Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5795878Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5796247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5796431Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5796806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5796931Z _lazy_init(state, module) 2022-11-23T02:48:20.5797291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5797440Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5797788Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5797919Z return func(*args, **kwargs) 2022-11-23T02:48:20.5798289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5798398Z p_assert( 2022-11-23T02:48:20.5798746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5798878Z traceback.print_stack() 2022-11-23T02:48:20.5799124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5799365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5799499Z File "", line 1, in 2022-11-23T02:48:20.5799712Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5799937Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5800146Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5800302Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5800516Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5800625Z self.run() 2022-11-23T02:48:20.5800833Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5800985Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5801341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5801461Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5801824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5801953Z getattr(self, test_name)() 2022-11-23T02:48:20.5802325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5802427Z fn() 2022-11-23T02:48:20.5802803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5802931Z test(self, **param_kwargs) 2022-11-23T02:48:20.5803341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5803484Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5803745Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5803862Z self.run_subtests( 2022-11-23T02:48:20.5804227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5804395Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5804773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5804928Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5805315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5805421Z output = model(*input) 2022-11-23T02:48:20.5805761Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5805909Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5806294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5806477Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5806853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5806990Z _lazy_init(state, module) 2022-11-23T02:48:20.5807351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5807482Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5807832Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5807967Z return func(*args, **kwargs) 2022-11-23T02:48:20.5808359Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5808468Z p_assert( 2022-11-23T02:48:20.5808817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5808947Z traceback.print_stack() 2022-11-23T02:48:20.5809061Z File "", line 1, in 2022-11-23T02:48:20.5809276Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5809498Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5809709Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5809863Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5810087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5810201Z self.run() 2022-11-23T02:48:20.5810408Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5810539Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5810893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5811030Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5811397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5811526Z getattr(self, test_name)() 2022-11-23T02:48:20.5811897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5811999Z fn() 2022-11-23T02:48:20.5812371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5812483Z test(self, **param_kwargs) 2022-11-23T02:48:20.5812902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5813042Z return func(*args, **kwargs) 2022-11-23T02:48:20.5813298Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5813415Z self.run_subtests( 2022-11-23T02:48:20.5813778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5813944Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5814323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5814464Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5814852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5814984Z output = model(*input) 2022-11-23T02:48:20.5815320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5815465Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5815850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5816033Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5816412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5816523Z _lazy_init(state, module) 2022-11-23T02:48:20.5816881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5817030Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5817377Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5817509Z return func(*args, **kwargs) 2022-11-23T02:48:20.5817897Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5818004Z p_assert( 2022-11-23T02:48:20.5818345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5818457Z traceback.print_stack() 2022-11-23T02:48:20.5818701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5819014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5819150Z File "", line 1, in 2022-11-23T02:48:20.5819370Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5819516Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5819727Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5819867Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5820086Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5820193Z self.run() 2022-11-23T02:48:20.5820398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5820548Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5820901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5821043Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5821409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5821518Z getattr(self, test_name)() 2022-11-23T02:48:20.5821884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5822037Z fn() 2022-11-23T02:48:20.5822423Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5822550Z test(self, **param_kwargs) 2022-11-23T02:48:20.5822914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5823045Z return func(*args, **kwargs) 2022-11-23T02:48:20.5823307Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5823411Z self.run_subtests( 2022-11-23T02:48:20.5823772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5823939Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5824312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5824475Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5824861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5824986Z output = model(*input) 2022-11-23T02:48:20.5825320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5825447Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5825833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5826020Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5826397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5826524Z _lazy_init(state, module) 2022-11-23T02:48:20.5826889Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5827039Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5827384Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5827495Z return func(*args, **kwargs) 2022-11-23T02:48:20.5827884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5827993Z p_assert( 2022-11-23T02:48:20.5828340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5828545Z traceback.print_stack() 2022-11-23T02:48:20.5828680Z File "", line 1, in 2022-11-23T02:48:20.5828892Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5829042Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5829234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5829390Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5829608Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5829717Z self.run() 2022-11-23T02:48:20.5829924Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5830074Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5830427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5830552Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5830920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5831049Z getattr(self, test_name)() 2022-11-23T02:48:20.5831417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5831568Z fn() 2022-11-23T02:48:20.5831951Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5832077Z test(self, **param_kwargs) 2022-11-23T02:48:20.5832439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5832551Z return func(*args, **kwargs) 2022-11-23T02:48:20.5832807Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5832929Z self.run_subtests( 2022-11-23T02:48:20.5833289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5833459Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5833833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5833997Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5834384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5834489Z output = model(*input) 2022-11-23T02:48:20.5834822Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5834968Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5835657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5835848Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5836228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5836356Z _lazy_init(state, module) 2022-11-23T02:48:20.5836721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5836852Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5837195Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5837326Z return func(*args, **kwargs) 2022-11-23T02:48:20.5837717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5837825Z p_assert( 2022-11-23T02:48:20.5838170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5838402Z traceback.print_stack() 2022-11-23T02:48:20.5838649Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5838874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5839009Z File "", line 1, in 2022-11-23T02:48:20.5839227Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5839370Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5839576Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5839734Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5839953Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5840063Z self.run() 2022-11-23T02:48:20.5840253Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5840410Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5840770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5840907Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5841344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5841486Z getattr(self, test_name)() 
2022-11-23T02:48:20.5841859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5841960Z fn() 2022-11-23T02:48:20.5842314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5842443Z test(self, **param_kwargs) 2022-11-23T02:48:20.5842809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5842946Z return func(*args, **kwargs) 2022-11-23T02:48:20.5843209Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5843325Z self.run_subtests( 2022-11-23T02:48:20.5843692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5843841Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5844219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5844379Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5844765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5844890Z output = model(*input) 2022-11-23T02:48:20.5845228Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5845381Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5845770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5845949Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5846312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5846440Z _lazy_init(state, module) 2022-11-23T02:48:20.5846801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5846951Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5847298Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5847431Z return func(*args, **kwargs) 2022-11-23T02:48:20.5847894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5848003Z p_assert( 2022-11-23T02:48:20.5848330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5848463Z traceback.print_stack() 2022-11-23T02:48:20.5848599Z File "", line 1, in 2022-11-23T02:48:20.5848813Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5848959Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5849160Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5849316Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5849515Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5849622Z self.run() 2022-11-23T02:48:20.5849830Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5849985Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5850333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5850470Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5850897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5851037Z getattr(self, test_name)() 
2022-11-23T02:48:20.5851390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5851498Z fn() 2022-11-23T02:48:20.5851874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5852004Z test(self, **param_kwargs) 2022-11-23T02:48:20.5852370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5852507Z return func(*args, **kwargs) 2022-11-23T02:48:20.5852763Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5852881Z self.run_subtests( 2022-11-23T02:48:20.5853226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5853394Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5853770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5853927Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5854310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5854435Z output = model(*input) 2022-11-23T02:48:20.5854776Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5854926Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5855294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5855476Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5855857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5855984Z _lazy_init(state, module) 2022-11-23T02:48:20.5856344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5856490Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5856838Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5856969Z return func(*args, **kwargs) 2022-11-23T02:48:20.5857409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5857517Z p_assert( 2022-11-23T02:48:20.5857866Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5857999Z traceback.print_stack() 2022-11-23T02:48:20.5858247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5858491Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5858625Z File "", line 1, in 2022-11-23T02:48:20.5858838Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5858966Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5859174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5859336Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5859557Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5859666Z self.run() 2022-11-23T02:48:20.5859873Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5860026Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5860409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5860557Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5860936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5861064Z getattr(self, test_name)() 2022-11-23T02:48:20.5861432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5861536Z fn() 2022-11-23T02:48:20.5861909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5862046Z test(self, **param_kwargs) 2022-11-23T02:48:20.5862398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5862530Z return func(*args, **kwargs) 2022-11-23T02:48:20.5862798Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5862920Z self.run_subtests( 2022-11-23T02:48:20.5863285Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5863453Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5863824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5863986Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5864361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5864490Z output = model(*input) 2022-11-23T02:48:20.5864832Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5864980Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5865369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5865551Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5865925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5866051Z _lazy_init(state, module) 2022-11-23T02:48:20.5866391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5866624Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5866979Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5867109Z return func(*args, **kwargs) 2022-11-23T02:48:20.5867498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5867611Z p_assert( 2022-11-23T02:48:20.5867961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5868093Z traceback.print_stack() 2022-11-23T02:48:20.5868207Z File "", line 1, in 2022-11-23T02:48:20.5868422Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5868564Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5868769Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5868929Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5869147Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5869254Z self.run() 2022-11-23T02:48:20.5869442Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5869591Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5869998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5870147Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5870522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5870650Z getattr(self, test_name)() 2022-11-23T02:48:20.5871018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5871121Z fn() 2022-11-23T02:48:20.5871484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5871613Z test(self, **param_kwargs) 2022-11-23T02:48:20.5871978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5872109Z return func(*args, **kwargs) 2022-11-23T02:48:20.5872371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5872486Z self.run_subtests( 2022-11-23T02:48:20.5872844Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5873013Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5873409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5873572Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5873967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5874093Z output = model(*input) 2022-11-23T02:48:20.5874426Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5874576Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5874962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5875452Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5875825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5875953Z _lazy_init(state, module) 2022-11-23T02:48:20.5876312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5876610Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5876966Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5877095Z return func(*args, **kwargs) 2022-11-23T02:48:20.5877488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5877599Z p_assert( 2022-11-23T02:48:20.5877927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.5878058Z traceback.print_stack() 2022-11-23T02:48:20.5878301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5878544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5878677Z File "", line 1, in 2022-11-23T02:48:20.5878895Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5879047Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5879253Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5879391Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5879609Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5879782Z self.run() 2022-11-23T02:48:20.5880003Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5880151Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5880502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5880641Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5881014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5881128Z getattr(self, test_name)() 2022-11-23T02:48:20.5881499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5881598Z fn() 2022-11-23T02:48:20.5881970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5882101Z test(self, **param_kwargs) 2022-11-23T02:48:20.5882469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5882601Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.5882861Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5882962Z self.run_subtests( 2022-11-23T02:48:20.5883324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5883492Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5883870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5884030Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5884416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5884545Z output = model(*input) 2022-11-23T02:48:20.5884881Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5885008Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5885399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5885584Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5885964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5886157Z _lazy_init(state, module) 2022-11-23T02:48:20.5886527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5886677Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5887024Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5887135Z return func(*args, **kwargs) 2022-11-23T02:48:20.5887521Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5887627Z p_assert( 2022-11-23T02:48:20.5887972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5888102Z traceback.print_stack() 2022-11-23T02:48:20.5888237Z File "", line 1, in 2022-11-23T02:48:20.5888458Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5888587Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5888796Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5888950Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5889219Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5889335Z self.run() 2022-11-23T02:48:20.5889545Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5889695Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5890047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5890166Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5890536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5890672Z getattr(self, test_name)() 2022-11-23T02:48:20.5891043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5891149Z fn() 2022-11-23T02:48:20.5891523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5891655Z test(self, **param_kwargs) 2022-11-23T02:48:20.5892019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.5892131Z return func(*args, **kwargs) 2022-11-23T02:48:20.5892392Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5892512Z self.run_subtests( 2022-11-23T02:48:20.5892873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5893047Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5893422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5893580Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5893971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5894078Z output = model(*input) 2022-11-23T02:48:20.5894413Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5894560Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5894948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5895128Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5895502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5895712Z _lazy_init(state, module) 2022-11-23T02:48:20.5896082Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5896211Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5896559Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5896692Z return func(*args, **kwargs) 2022-11-23T02:48:20.5897081Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5897192Z p_assert( 2022-11-23T02:48:20.5897539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5897669Z traceback.print_stack() 2022-11-23T02:48:20.5897911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5898137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5898267Z File "", line 1, in 2022-11-23T02:48:20.5898477Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5898622Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5898879Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5899047Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5899266Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5899354Z self.run() 2022-11-23T02:48:20.5899559Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5899708Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5900063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5900207Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5900583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5900714Z getattr(self, test_name)() 2022-11-23T02:48:20.5901092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5901176Z fn() 2022-11-23T02:48:20.5901549Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5901678Z test(self, **param_kwargs) 2022-11-23T02:48:20.5902047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5902178Z return func(*args, **kwargs) 2022-11-23T02:48:20.5902438Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5902562Z self.run_subtests( 2022-11-23T02:48:20.5902926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5903077Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5903460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5903620Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5904007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5904135Z output = model(*input) 2022-11-23T02:48:20.5904473Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5904620Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5905005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5905237Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5905620Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5905747Z _lazy_init(state, module) 2022-11-23T02:48:20.5906110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5906256Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5906655Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5906792Z return func(*args, **kwargs) 2022-11-23T02:48:20.5907189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5907278Z p_assert( 2022-11-23T02:48:20.5907629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5907761Z traceback.print_stack() 2022-11-23T02:48:20.5907894Z File "", line 1, in 2022-11-23T02:48:20.5908110Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5908259Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5908519Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5908686Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5908885Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5908992Z self.run() 2022-11-23T02:48:20.5909199Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5909351Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5909704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5909849Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5910220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5910330Z getattr(self, test_name)() 2022-11-23T02:48:20.5910699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5910804Z fn() 2022-11-23T02:48:20.5911177Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5911306Z test(self, **param_kwargs) 2022-11-23T02:48:20.5911670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5911810Z return func(*args, **kwargs) 2022-11-23T02:48:20.5912051Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5912175Z self.run_subtests( 2022-11-23T02:48:20.5912534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5912703Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5913083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5913242Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5913626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5913750Z output = model(*input) 2022-11-23T02:48:20.5914067Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5914212Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5914598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5914848Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5915433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.5915564Z _lazy_init(state, module) 2022-11-23T02:48:20.5915941Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.5916089Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5916421Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5916552Z return func(*args, **kwargs) 2022-11-23T02:48:20.5916937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5917045Z p_assert( 2022-11-23T02:48:20.5917391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5917520Z traceback.print_stack() 2022-11-23T02:48:20.5917761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5918003Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5918202Z File "", line 1, in 2022-11-23T02:48:20.5918434Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5918577Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5918782Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5918935Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5919151Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5919259Z self.run() 2022-11-23T02:48:20.5919454Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5919605Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5919960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5920098Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5920473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5920601Z getattr(self, test_name)() 
2022-11-23T02:48:20.5920967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5921070Z fn() 2022-11-23T02:48:20.5921424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5921552Z test(self, **param_kwargs) 2022-11-23T02:48:20.5921915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5922051Z return func(*args, **kwargs) 2022-11-23T02:48:20.5922311Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5922429Z self.run_subtests( 2022-11-23T02:48:20.5922793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5922961Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5923317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5923474Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5923855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5923978Z output = model(*input) 2022-11-23T02:48:20.5924395Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5924543Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5924930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5925114Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5925472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5925601Z _lazy_init(state, module) 2022-11-23T02:48:20.5925960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5926105Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5926453Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5926588Z return func(*args, **kwargs) 2022-11-23T02:48:20.5926972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5927080Z p_assert( 2022-11-23T02:48:20.5927406Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5927588Z traceback.print_stack() 2022-11-23T02:48:20.5927733Z File "", line 1, in 2022-11-23T02:48:20.5927952Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5928096Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5928303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5928459Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5928678Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5928773Z self.run() 2022-11-23T02:48:20.5928979Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5929128Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5929481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5929621Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5929995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5930122Z getattr(self, test_name)() 
2022-11-23T02:48:20.5930489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5930573Z fn() 2022-11-23T02:48:20.5930945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5931076Z test(self, **param_kwargs) 2022-11-23T02:48:20.5931446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5931572Z return func(*args, **kwargs) 2022-11-23T02:48:20.5931837Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:48:20.5931955Z self.run_subtests( 2022-11-23T02:48:20.5932301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.5932472Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.5932846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.5933005Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.5933393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.5933582Z output = model(*input) 2022-11-23T02:48:20.5933917Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.5934062Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.5934444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.5934613Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.5934988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.5935114Z _lazy_init(state, module) 2022-11-23T02:48:20.5935474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.5935623Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.5935971Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.5936107Z return func(*args, **kwargs) 2022-11-23T02:48:20.5936497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.5936585Z p_assert( 2022-11-23T02:48:20.5936933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.5937113Z traceback.print_stack() 2022-11-23T02:48:20.5937363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5937602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5937843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5938083Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5938318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5938542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5938775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5939011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5939248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5939480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5939711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5939943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5940172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5940385Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5940618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5940848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5941079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5941315Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5941547Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5941779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5942008Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5942220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5942446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5942742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5942970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5943199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5943430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5943656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5943882Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5944109Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5944318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5944552Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5944781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5945014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5945296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5945535Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5945762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5945990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5946200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5946429Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5946661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5946891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5947119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.5947354Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5947582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5947811Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5948038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5948248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.5948361Z dist init r=1, world=2 2022-11-23T02:48:20.5948704Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5949032Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5949354Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5949667Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5949979Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5950291Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5950666Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.5950978Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5951268Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5951574Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5951882Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5952191Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.5952309Z dist init r=0, world=2 2022-11-23T02:48:20.5952703Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5953034Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5953351Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5953664Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5953983Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.5954292Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5954588Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5954894Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5955503Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5955826Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5956131Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5956439Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.5956546Z ok (5.713s) 2022-11-23T02:48:20.5956931Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90809 2022-11-23T02:48:20.5957158Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90810 2022-11-23T02:48:20.5957563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5957861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5958240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5958434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5958816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5958998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5959389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5959587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5959841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5960099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5960495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5960904Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5961206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5961450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5962492Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5962616Z warnings.warn( 2022-11-23T02:48:20.5963644Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5963759Z warnings.warn( 2022-11-23T02:48:20.5964520Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5965276Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.5966037Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5966792Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5967598Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5968344Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5969088Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5969835Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5970625Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5971374Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.5971496Z dist init r=0, world=2 2022-11-23T02:48:20.5971609Z dist init r=1, world=2 2022-11-23T02:48:20.5971693Z ok (4.712s) 2022-11-23T02:48:20.5972072Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90892 2022-11-23T02:48:20.5972301Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90893 2022-11-23T02:48:20.5972685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5972869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5973298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5973498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5973882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5974068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5974437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5974641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5974892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5975141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5975553Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5975964Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5976267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5976503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5977544Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5977664Z warnings.warn( 2022-11-23T02:48:20.5978866Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5979088Z warnings.warn( 2022-11-23T02:48:20.5979285Z dist init r=0, world=2 2022-11-23T02:48:20.5979490Z dist init r=1, world=2 2022-11-23T02:48:20.5979662Z ok (4.912s) 2022-11-23T02:48:20.5980130Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90975 2022-11-23T02:48:20.5980367Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90976 2022-11-23T02:48:20.5980761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5980942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5981340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5981519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5981897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5982082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5982469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5982660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5982914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5983168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5983576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5983989Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5984208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5984443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5985470Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5985588Z warnings.warn( 2022-11-23T02:48:20.5986868Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5986990Z warnings.warn( 2022-11-23T02:48:20.5987106Z dist init r=0, world=2 2022-11-23T02:48:20.5987216Z dist init r=1, world=2 2022-11-23T02:48:20.5987321Z ok (4.912s) 2022-11-23T02:48:20.5987708Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91058 2022-11-23T02:48:20.5987917Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91059 2022-11-23T02:48:20.5988313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5988493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5988882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5989136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5989523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.5989701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.5990087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.5990282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.5990513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.5990769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.5991179Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.5991587Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.5991824Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.5992055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.5993083Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.5993208Z warnings.warn( 2022-11-23T02:48:20.5994374Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.5994501Z warnings.warn( 2022-11-23T02:48:20.5994745Z File "", line 1, in 2022-11-23T02:48:20.5995311Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.5995479Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.5995798Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.5995952Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.5996172Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.5996330Z self.run() 2022-11-23T02:48:20.5996543Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.5996695Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.5997197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.5997336Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.5997708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.5997834Z getattr(self, test_name)() 2022-11-23T02:48:20.5998203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.5998310Z fn() 2022-11-23T02:48:20.5998687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.5998815Z test(self, **param_kwargs) 2022-11-23T02:48:20.5999235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.5999377Z return func(*args, **kwargs) 2022-11-23T02:48:20.5999677Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.5999791Z self.run_subtests( 
2022-11-23T02:48:20.6000154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6000319Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6000695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6000853Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6001218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6001339Z output = model(*input) 2022-11-23T02:48:20.6001682Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6001830Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6002216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6002398Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6002774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6002906Z _lazy_init(state, module) 2022-11-23T02:48:20.6003251Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6003398Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6003747Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6003880Z return func(*args, **kwargs) 2022-11-23T02:48:20.6004274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6004382Z p_assert( 2022-11-23T02:48:20.6004729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.6004859Z traceback.print_stack() 2022-11-23T02:48:20.6004973Z File "", line 1, in 2022-11-23T02:48:20.6005184Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6005397Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6005601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6005755Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6005973Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6006079Z self.run() 2022-11-23T02:48:20.6006272Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6006423Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6006778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6006917Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6007290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6007416Z getattr(self, test_name)() 2022-11-23T02:48:20.6007787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6007890Z fn() 2022-11-23T02:48:20.6008245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6008374Z test(self, **param_kwargs) 2022-11-23T02:48:20.6008788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6008925Z return func(*args, **kwargs) 2022-11-23T02:48:20.6009229Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6009345Z 
self.run_subtests( 2022-11-23T02:48:20.6009706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6009872Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6010234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6010390Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6010779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6010904Z output = model(*input) 2022-11-23T02:48:20.6011239Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6011387Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6011774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6011959Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6012318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6012444Z _lazy_init(state, module) 2022-11-23T02:48:20.6012803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6012950Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6013301Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6013432Z return func(*args, **kwargs) 2022-11-23T02:48:20.6013821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6013929Z p_assert( 2022-11-23T02:48:20.6014260Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6014390Z traceback.print_stack() 2022-11-23T02:48:20.6014664Z File "", line 1, in 2022-11-23T02:48:20.6014883Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6015112Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6015490Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6015769Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6015997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6016092Z self.run() 2022-11-23T02:48:20.6016301Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6016452Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6016814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6016948Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6017318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6017450Z getattr(self, test_name)() 2022-11-23T02:48:20.6017801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6017903Z fn() 2022-11-23T02:48:20.6018277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6018512Z test(self, **param_kwargs) 2022-11-23T02:48:20.6018892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6019019Z return func(*args, **kwargs) 2022-11-23T02:48:20.6019322Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6019439Z self.run_subtests( 2022-11-23T02:48:20.6019782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6019953Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6020328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6020490Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6020879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6021002Z output = model(*input) 2022-11-23T02:48:20.6021339Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6021481Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6021849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6022029Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6022404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6022536Z _lazy_init(state, module) 2022-11-23T02:48:20.6022894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6023041Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6023390Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6023520Z return func(*args, **kwargs) 2022-11-23T02:48:20.6023912Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6024001Z p_assert( 
2022-11-23T02:48:20.6024341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6024468Z traceback.print_stack() 2022-11-23T02:48:20.6024599Z File "", line 1, in 2022-11-23T02:48:20.6024891Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6025038Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6025242Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6025381Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6025601Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6025705Z self.run() 2022-11-23T02:48:20.6025911Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6026060Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6026407Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6026545Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6026914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6027030Z getattr(self, test_name)() 2022-11-23T02:48:20.6027394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6027494Z fn() 2022-11-23T02:48:20.6027872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6028050Z test(self, **param_kwargs) 2022-11-23T02:48:20.6028429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6028557Z return func(*args, **kwargs) 2022-11-23T02:48:20.6028860Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6028959Z self.run_subtests( 2022-11-23T02:48:20.6029320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6029494Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6029864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6030020Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6030409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6030532Z output = model(*input) 2022-11-23T02:48:20.6030870Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6031000Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6031383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6031564Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6031944Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6032069Z _lazy_init(state, module) 2022-11-23T02:48:20.6032431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6032579Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6032931Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6033043Z return func(*args, **kwargs) 2022-11-23T02:48:20.6033433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T02:48:20.6033538Z p_assert( 2022-11-23T02:48:20.6033884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6034013Z traceback.print_stack() 2022-11-23T02:48:20.6034209Z File "", line 1, in 2022-11-23T02:48:20.6034424Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6034552Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6034760Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6034914Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6035423Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6035535Z self.run() 2022-11-23T02:48:20.6035744Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6035893Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6036252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6036370Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6036743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6036880Z getattr(self, test_name)() 2022-11-23T02:48:20.6037248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6037351Z fn() 2022-11-23T02:48:20.6037819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6037961Z test(self, **param_kwargs) 2022-11-23T02:48:20.6038333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6038445Z return func(*args, **kwargs) 
2022-11-23T02:48:20.6038749Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6038865Z self.run_subtests( 2022-11-23T02:48:20.6039228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6039402Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6039773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6039931Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6040320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6040428Z output = model(*input) 2022-11-23T02:48:20.6040758Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6040902Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6041290Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6041473Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6041854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6041983Z _lazy_init(state, module) 2022-11-23T02:48:20.6042347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6042480Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6042830Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6042960Z return func(*args, **kwargs) 2022-11-23T02:48:20.6043347Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6043451Z p_assert( 2022-11-23T02:48:20.6043797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6044013Z traceback.print_stack() 2022-11-23T02:48:20.6044145Z File "", line 1, in 2022-11-23T02:48:20.6044343Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6044489Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6044695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6044854Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6045071Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6045178Z self.run() 2022-11-23T02:48:20.6045384Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6045516Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6045868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6046004Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6046381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6046507Z getattr(self, test_name)() 2022-11-23T02:48:20.6046875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6046976Z fn() 2022-11-23T02:48:20.6047396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6047515Z test(self, **param_kwargs) 2022-11-23T02:48:20.6047882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.6048012Z return func(*args, **kwargs) 2022-11-23T02:48:20.6048315Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6048433Z self.run_subtests( 2022-11-23T02:48:20.6048801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6048972Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6049346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6049490Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6049876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6050001Z output = model(*input) 2022-11-23T02:48:20.6050335Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6050481Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6050867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6051055Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6051428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6051555Z _lazy_init(state, module) 2022-11-23T02:48:20.6051903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6052052Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6052398Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6052528Z return func(*args, **kwargs) 
2022-11-23T02:48:20.6052916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6053021Z p_assert( 2022-11-23T02:48:20.6053367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6053544Z traceback.print_stack() 2022-11-23T02:48:20.6053678Z File "", line 1, in 2022-11-23T02:48:20.6053894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6054041Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6054253Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6054407Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6054623Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6054731Z self.run() 2022-11-23T02:48:20.6054918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6055067Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6055416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6055556Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6055927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6056054Z getattr(self, test_name)() 2022-11-23T02:48:20.6056420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6056522Z fn() 2022-11-23T02:48:20.6056928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6057066Z test(self, **param_kwargs) 2022-11-23T02:48:20.6057432Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6057560Z return func(*args, **kwargs) 2022-11-23T02:48:20.6057862Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6057983Z self.run_subtests( 2022-11-23T02:48:20.6058346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6058512Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6058864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6059025Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6059410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6059534Z output = model(*input) 2022-11-23T02:48:20.6059868Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6060012Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6060395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6060583Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6060940Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6061065Z _lazy_init(state, module) 2022-11-23T02:48:20.6061430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6061579Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6061923Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:48:20.6062052Z return func(*args, **kwargs) 2022-11-23T02:48:20.6062434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6062539Z p_assert( 2022-11-23T02:48:20.6062865Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6063062Z traceback.print_stack() 2022-11-23T02:48:20.6063192Z File "", line 1, in 2022-11-23T02:48:20.6063405Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6063552Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6063759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6063914Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6064112Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6064217Z self.run() 2022-11-23T02:48:20.6064422Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6064573Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6064922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6065063Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6065433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6065560Z getattr(self, test_name)() 2022-11-23T02:48:20.6065957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6066067Z fn() 2022-11-23T02:48:20.6066440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6066565Z test(self, **param_kwargs) 
2022-11-23T02:48:20.6066927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6067056Z return func(*args, **kwargs) 2022-11-23T02:48:20.6067360Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6067485Z self.run_subtests( 2022-11-23T02:48:20.6067829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6067998Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6068373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6068532Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6068917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6069038Z output = model(*input) 2022-11-23T02:48:20.6069369Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6069517Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6069888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6070077Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6070453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6070579Z _lazy_init(state, module) 2022-11-23T02:48:20.6070943Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6071089Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6071435Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6071563Z return func(*args, **kwargs) 2022-11-23T02:48:20.6071932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6072038Z p_assert( 2022-11-23T02:48:20.6072455Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6072584Z traceback.print_stack() 2022-11-23T02:48:20.6083763Z File "", line 1, in 2022-11-23T02:48:20.6084048Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6084363Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6084585Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6084730Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6084950Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6085046Z self.run() 2022-11-23T02:48:20.6085256Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6085401Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6085808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6085945Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6086328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6086452Z getattr(self, test_name)() 2022-11-23T02:48:20.6086974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6087090Z fn() 2022-11-23T02:48:20.6087492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:48:20.6087617Z test(self, **param_kwargs) 2022-11-23T02:48:20.6088012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6088149Z return func(*args, **kwargs) 2022-11-23T02:48:20.6088456Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6088586Z self.run_subtests( 2022-11-23T02:48:20.6088975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6089153Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6089564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6089733Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6090148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6090276Z output = model(*input) 2022-11-23T02:48:20.6090618Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6090774Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6091197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6091388Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6091797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6091932Z _lazy_init(state, module) 2022-11-23T02:48:20.6092320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6092473Z handle.init_flat_param_attributes() 
2022-11-23T02:48:20.6092830Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6092966Z return func(*args, **kwargs) 2022-11-23T02:48:20.6093384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6093587Z p_assert( 2022-11-23T02:48:20.6093965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6094104Z traceback.print_stack() 2022-11-23T02:48:20.6094242Z File "", line 1, in 2022-11-23T02:48:20.6094471Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6094609Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6094833Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6094995Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6095228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6095338Z self.run() 2022-11-23T02:48:20.6095560Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6095712Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6096091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6096216Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6096613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6096746Z getattr(self, test_name)() 2022-11-23T02:48:20.6097199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6097313Z fn() 2022-11-23T02:48:20.6097724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:48:20.6097855Z test(self, **param_kwargs) 2022-11-23T02:48:20.6098230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6098366Z return func(*args, **kwargs) 2022-11-23T02:48:20.6098698Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6098821Z self.run_subtests( 2022-11-23T02:48:20.6099212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6099397Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6099803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6099972Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6100386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6100498Z output = model(*input) 2022-11-23T02:48:20.6100855Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6101012Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6101429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6101621Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6102028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6102160Z _lazy_init(state, module) 2022-11-23T02:48:20.6102546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6102680Z 
handle.init_flat_param_attributes() 2022-11-23T02:48:20.6103053Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6103187Z return func(*args, **kwargs) 2022-11-23T02:48:20.6103605Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6103801Z p_assert( 2022-11-23T02:48:20.6104212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6104458Z traceback.print_stack() 2022-11-23T02:48:20.6104703Z File "", line 1, in 2022-11-23T02:48:20.6104956Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6105106Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6105322Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6105476Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6105697Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6105805Z self.run() 2022-11-23T02:48:20.6106013Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6106153Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6106521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6106659Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6107033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6107231Z getattr(self, test_name)() 2022-11-23T02:48:20.6107623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6107728Z fn() 2022-11-23T02:48:20.6108101Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6108211Z test(self, **param_kwargs) 2022-11-23T02:48:20.6108578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6108715Z return func(*args, **kwargs) 2022-11-23T02:48:20.6109019Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6109137Z self.run_subtests( 2022-11-23T02:48:20.6109503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6109678Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6110056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6110196Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6110584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6110708Z output = model(*input) 2022-11-23T02:48:20.6111049Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6111201Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6111593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6111952Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6112440Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6112555Z _lazy_init(state, module) 2022-11-23T02:48:20.6112922Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6113070Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6113421Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6113551Z return func(*args, **kwargs) 2022-11-23T02:48:20.6113937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6114120Z p_assert( 2022-11-23T02:48:20.6114470Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6114581Z traceback.print_stack() 2022-11-23T02:48:20.6114716Z File "", line 1, in 2022-11-23T02:48:20.6114938Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6115347Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6115568Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6115726Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6115943Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6116051Z self.run() 2022-11-23T02:48:20.6116244Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6116399Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6116758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6116893Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6117349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6117494Z getattr(self, test_name)() 2022-11-23T02:48:20.6117869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T02:48:20.6117952Z fn() 2022-11-23T02:48:20.6118328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6118456Z test(self, **param_kwargs) 2022-11-23T02:48:20.6118819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6118953Z return func(*args, **kwargs) 2022-11-23T02:48:20.6119256Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6119374Z self.run_subtests( 2022-11-23T02:48:20.6119740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6119890Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6120264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6120420Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6120803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6120923Z output = model(*input) 2022-11-23T02:48:20.6121261Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6121412Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6121797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6121978Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6122338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6122465Z _lazy_init(state, module) 
2022-11-23T02:48:20.6122820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6122967Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6123311Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6123442Z return func(*args, **kwargs) 2022-11-23T02:48:20.6123925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6124041Z p_assert( 2022-11-23T02:48:20.6124371Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6124503Z traceback.print_stack() 2022-11-23T02:48:20.6124640Z File "", line 1, in 2022-11-23T02:48:20.6124853Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6124999Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6125203Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6125356Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6125553Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6125658Z self.run() 2022-11-23T02:48:20.6125866Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6126020Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6126373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6126510Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6126932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6127074Z getattr(self, test_name)() 2022-11-23T02:48:20.6127427Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6127531Z fn() 2022-11-23T02:48:20.6127904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6128033Z test(self, **param_kwargs) 2022-11-23T02:48:20.6128397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6128533Z return func(*args, **kwargs) 2022-11-23T02:48:20.6128837Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6128956Z self.run_subtests( 2022-11-23T02:48:20.6129300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6129467Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6129844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6130000Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6130382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6130502Z output = model(*input) 2022-11-23T02:48:20.6130842Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6130987Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6131356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6131538Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6131918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.6132041Z _lazy_init(state, module) 2022-11-23T02:48:20.6132400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6132547Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6132890Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6133088Z return func(*args, **kwargs) 2022-11-23T02:48:20.6133463Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6133569Z p_assert( 2022-11-23T02:48:20.6133915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6134049Z traceback.print_stack() 2022-11-23T02:48:20.6134183Z File "", line 1, in 2022-11-23T02:48:20.6134397Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6134543Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6134730Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6134884Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6135099Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6135209Z self.run() 2022-11-23T02:48:20.6135417Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6135563Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6135914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6136053Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6136455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6136594Z getattr(self, test_name)() 
2022-11-23T02:48:20.6136967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6137071Z fn() 2022-11-23T02:48:20.6137446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6137572Z test(self, **param_kwargs) 2022-11-23T02:48:20.6137942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6138072Z return func(*args, **kwargs) 2022-11-23T02:48:20.6138356Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6138473Z self.run_subtests( 2022-11-23T02:48:20.6138833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6139000Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6139372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6139529Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6139914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6140044Z output = model(*input) 2022-11-23T02:48:20.6140362Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6140508Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6140891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6141074Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6141453Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6141580Z _lazy_init(state, module) 2022-11-23T02:48:20.6141937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6142083Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6142412Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6142615Z return func(*args, **kwargs) 2022-11-23T02:48:20.6143013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6143124Z p_assert( 2022-11-23T02:48:20.6143472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6143605Z traceback.print_stack() 2022-11-23T02:48:20.6143737Z File "", line 1, in 2022-11-23T02:48:20.6143954Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6144082Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6144289Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6144445Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6144662Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6144775Z self.run() 2022-11-23T02:48:20.6144983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6145133Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6145464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6145655Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6146041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:48:20.6146171Z getattr(self, test_name)() 2022-11-23T02:48:20.6146538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6146642Z fn() 2022-11-23T02:48:20.6147017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6147142Z test(self, **param_kwargs) 2022-11-23T02:48:20.6147496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6147625Z return func(*args, **kwargs) 2022-11-23T02:48:20.6147927Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6148048Z self.run_subtests( 2022-11-23T02:48:20.6148411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6148579Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6148949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6149104Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6149469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6149599Z output = model(*input) 2022-11-23T02:48:20.6149931Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6150076Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6150467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6150650Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T02:48:20.6151027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6151150Z _lazy_init(state, module) 2022-11-23T02:48:20.6151505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6151636Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6151982Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6152174Z return func(*args, **kwargs) 2022-11-23T02:48:20.6152568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6152677Z p_assert( 2022-11-23T02:48:20.6153030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6153163Z traceback.print_stack() 2022-11-23T02:48:20.6153279Z File "", line 1, in 2022-11-23T02:48:20.6153492Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6153635Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6153841Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6153993Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6154209Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6154318Z self.run() 2022-11-23T02:48:20.6154524Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6154656Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6155008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6155582Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6155988Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6156119Z getattr(self, test_name)() 2022-11-23T02:48:20.6156486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6156589Z fn() 2022-11-23T02:48:20.6156960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6157079Z test(self, **param_kwargs) 2022-11-23T02:48:20.6157443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6157575Z return func(*args, **kwargs) 2022-11-23T02:48:20.6157884Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6158008Z self.run_subtests( 2022-11-23T02:48:20.6158365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6158530Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6158904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6159043Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6159430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6159561Z output = model(*input) 2022-11-23T02:48:20.6159894Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6160037Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6160426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T02:48:20.6160608Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6160983Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6161093Z _lazy_init(state, module) 2022-11-23T02:48:20.6161451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6161597Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6162030Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6162161Z return func(*args, **kwargs) 2022-11-23T02:48:20.6162552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6162660Z p_assert( 2022-11-23T02:48:20.6163004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6163118Z traceback.print_stack() 2022-11-23T02:48:20.6163250Z File "", line 1, in 2022-11-23T02:48:20.6163462Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6163606Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6163809Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6163962Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6164184Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6164273Z self.run() 2022-11-23T02:48:20.6164481Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6164630Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6165027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6165178Z self.run_test(test_name, 
pipe) 2022-11-23T02:48:20.6165550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6165679Z getattr(self, test_name)() 2022-11-23T02:48:20.6166040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6166125Z fn() 2022-11-23T02:48:20.6166497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6166629Z test(self, **param_kwargs) 2022-11-23T02:48:20.6166999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6167129Z return func(*args, **kwargs) 2022-11-23T02:48:20.6167435Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6167555Z self.run_subtests( 2022-11-23T02:48:20.6167914Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6168064Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6168437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6168596Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6168986Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6169111Z output = model(*input) 2022-11-23T02:48:20.6169445Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6169588Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6169976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T02:48:20.6170140Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6170516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6170638Z _lazy_init(state, module) 2022-11-23T02:48:20.6171037Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6171264Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6171616Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6171748Z return func(*args, **kwargs) 2022-11-23T02:48:20.6172135Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6172228Z p_assert( 2022-11-23T02:48:20.6172580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6172709Z traceback.print_stack() 2022-11-23T02:48:20.6172840Z File "", line 1, in 2022-11-23T02:48:20.6173055Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6173199Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6173452Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6173611Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6173810Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6173919Z self.run() 2022-11-23T02:48:20.6174129Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6174278Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6174684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6174836Z 
self.run_test(test_name, pipe) 2022-11-23T02:48:20.6175209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6175318Z getattr(self, test_name)() 2022-11-23T02:48:20.6175684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6175786Z fn() 2022-11-23T02:48:20.6176163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6176291Z test(self, **param_kwargs) 2022-11-23T02:48:20.6176653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6176785Z return func(*args, **kwargs) 2022-11-23T02:48:20.6177091Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6177191Z self.run_subtests( 2022-11-23T02:48:20.6177550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6177716Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6178087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6178248Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6178633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6178757Z output = model(*input) 2022-11-23T02:48:20.6179090Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6179222Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6179608Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6179791Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6180165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6180294Z _lazy_init(state, module) 2022-11-23T02:48:20.6180654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6180933Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6181287Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6181417Z return func(*args, **kwargs) 2022-11-23T02:48:20.6181790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6181901Z p_assert( 2022-11-23T02:48:20.6182249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6182378Z traceback.print_stack() 2022-11-23T02:48:20.6182510Z File "", line 1, in 2022-11-23T02:48:20.6182722Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6182868Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6183056Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6183218Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6183436Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6183541Z self.run() 2022-11-23T02:48:20.6183747Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6183950Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6184314Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6184454Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6184804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6184931Z getattr(self, test_name)() 2022-11-23T02:48:20.6185296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6185402Z fn() 2022-11-23T02:48:20.6185773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6185900Z test(self, **param_kwargs) 2022-11-23T02:48:20.6186266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6186401Z return func(*args, **kwargs) 2022-11-23T02:48:20.6186687Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6186803Z self.run_subtests( 2022-11-23T02:48:20.6187161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6187327Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6187695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6187859Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6188244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6188368Z output = model(*input) 2022-11-23T02:48:20.6188685Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6188831Z return 
forward_call(*input, **kwargs) 2022-11-23T02:48:20.6189215Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6189394Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6189772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6189899Z _lazy_init(state, module) 2022-11-23T02:48:20.6190257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6190471Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6190805Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6190936Z return func(*args, **kwargs) 2022-11-23T02:48:20.6191328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6191436Z p_assert( 2022-11-23T02:48:20.6191781Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6191913Z traceback.print_stack() 2022-11-23T02:48:20.6192044Z File "", line 1, in 2022-11-23T02:48:20.6192259Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6192389Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6192601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6192755Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6192970Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6193077Z self.run() 2022-11-23T02:48:20.6193282Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6193484Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:48:20.6193827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6193964Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6194333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6194464Z getattr(self, test_name)() 2022-11-23T02:48:20.6194833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6194941Z fn() 2022-11-23T02:48:20.6195504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6195630Z test(self, **param_kwargs) 2022-11-23T02:48:20.6195978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6196110Z return func(*args, **kwargs) 2022-11-23T02:48:20.6196412Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6196530Z self.run_subtests( 2022-11-23T02:48:20.6196893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6197058Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6197426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6197589Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6197957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6198084Z output = model(*input) 2022-11-23T02:48:20.6198418Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:48:20.6198565Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6198947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6199130Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6199502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6199625Z _lazy_init(state, module) 2022-11-23T02:48:20.6200062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6200211Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6200557Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6200687Z return func(*args, **kwargs) 2022-11-23T02:48:20.6201076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6201185Z p_assert( 2022-11-23T02:48:20.6201529Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6201658Z traceback.print_stack() 2022-11-23T02:48:20.6201771Z File "", line 1, in 2022-11-23T02:48:20.6201984Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6202127Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6202338Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6202491Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6202706Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6202816Z self.run() 2022-11-23T02:48:20.6203092Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6203240Z 
self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6203592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6203733Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6204102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6204232Z getattr(self, test_name)() 2022-11-23T02:48:20.6204598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6204707Z fn() 2022-11-23T02:48:20.6205063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6205191Z test(self, **param_kwargs) 2022-11-23T02:48:20.6205559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6205689Z return func(*args, **kwargs) 2022-11-23T02:48:20.6205991Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6206107Z self.run_subtests( 2022-11-23T02:48:20.6206514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6206687Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6207066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6207206Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6207590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6207714Z output = model(*input) 2022-11-23T02:48:20.6208052Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T02:48:20.6208197Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6208580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6208760Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6209139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6209316Z _lazy_init(state, module) 2022-11-23T02:48:20.6209681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6209831Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6210180Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6210314Z return func(*args, **kwargs) 2022-11-23T02:48:20.6210704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6210810Z p_assert( 2022-11-23T02:48:20.6211156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6211268Z traceback.print_stack() 2022-11-23T02:48:20.6211399Z File "", line 1, in 2022-11-23T02:48:20.6211609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6211758Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6211965Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6212117Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6212334Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6212425Z self.run() 2022-11-23T02:48:20.6212685Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T02:48:20.6212851Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6213203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6213338Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6213708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6213836Z getattr(self, test_name)() 2022-11-23T02:48:20.6214216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6214300Z fn() 2022-11-23T02:48:20.6214670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6214796Z test(self, **param_kwargs) 2022-11-23T02:48:20.6215164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6215295Z return func(*args, **kwargs) 2022-11-23T02:48:20.6215593Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6215710Z self.run_subtests( 2022-11-23T02:48:20.6216070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6216218Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6216593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6216749Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6217134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6217262Z output = model(*input) 2022-11-23T02:48:20.6217595Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6217739Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6218121Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6218284Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6218659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6218849Z _lazy_init(state, module) 2022-11-23T02:48:20.6219217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6219366Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6219712Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6219841Z return func(*args, **kwargs) 2022-11-23T02:48:20.6220229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6220318Z p_assert( 2022-11-23T02:48:20.6220663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6220794Z traceback.print_stack() 2022-11-23T02:48:20.6220928Z File "", line 1, in 2022-11-23T02:48:20.6221146Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6221296Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6221501Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6221653Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6221853Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6222010Z self.run() 2022-11-23T02:48:20.6222230Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6222383Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6222739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6222878Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6223244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6223361Z getattr(self, test_name)() 2022-11-23T02:48:20.6223730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6223828Z fn() 2022-11-23T02:48:20.6224198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6224328Z test(self, **param_kwargs) 2022-11-23T02:48:20.6224697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6224827Z return func(*args, **kwargs) 2022-11-23T02:48:20.6225129Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6225227Z self.run_subtests( 2022-11-23T02:48:20.6225586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6225760Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6226135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6226291Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6226679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6226801Z output = model(*input) 
2022-11-23T02:48:20.6227135Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6227264Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6227645Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6227824Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6228199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6228391Z _lazy_init(state, module) 2022-11-23T02:48:20.6228760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6228908Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6229260Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6229374Z return func(*args, **kwargs) 2022-11-23T02:48:20.6229761Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6229869Z p_assert( 2022-11-23T02:48:20.6230212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6230339Z traceback.print_stack() 2022-11-23T02:48:20.6230472Z File "", line 1, in 2022-11-23T02:48:20.6230692Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6230838Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6231027Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6231180Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6231453Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6231574Z self.run() 
2022-11-23T02:48:20.6231781Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6231932Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6232284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6232422Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6232773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6232908Z getattr(self, test_name)() 2022-11-23T02:48:20.6233272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6233373Z fn() 2022-11-23T02:48:20.6233744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6233876Z test(self, **param_kwargs) 2022-11-23T02:48:20.6234246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6234360Z return func(*args, **kwargs) 2022-11-23T02:48:20.6234664Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6234782Z self.run_subtests( 2022-11-23T02:48:20.6235336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6235518Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6235900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6236060Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6236447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:48:20.6236575Z output = model(*input) 2022-11-23T02:48:20.6236892Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6237034Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6237416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6237596Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6238086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6238216Z _lazy_init(state, module) 2022-11-23T02:48:20.6238577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6238725Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6239055Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6239185Z return func(*args, **kwargs) 2022-11-23T02:48:20.6239568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6239674Z p_assert( 2022-11-23T02:48:20.6240018Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6240147Z traceback.print_stack() 2022-11-23T02:48:20.6240914Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.6241731Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.6242498Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.6243258Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.6244008Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.6244753Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.6244874Z dist init r=1, world=2 2022-11-23T02:48:20.6245195Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6245520Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.6245837Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6246146Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6246453Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6246757Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6247129Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6247441Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6247751Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6248062Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6248370Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6248680Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.6248778Z dist init r=0, world=2 2022-11-23T02:48:20.6249150Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6249484Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6249797Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6250110Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6250425Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6250731Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6251034Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6251338Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6251652Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6251964Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.6252254Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6252562Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6252668Z ok (5.012s) 2022-11-23T02:48:20.6253045Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91141 2022-11-23T02:48:20.6253272Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91142 2022-11-23T02:48:20.6253669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6253914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6254312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6254510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6254874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6255055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6255441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6255631Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6255881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.6256139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T02:48:20.6256547Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6257003Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6257252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.6257467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.6258505Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6258630Z warnings.warn( 2022-11-23T02:48:20.6259658Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.6259774Z warnings.warn( 2022-11-23T02:48:20.6259904Z File "", line 1, in 2022-11-23T02:48:20.6260117Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6260262Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6260475Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6260629Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6260832Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6260940Z self.run() 2022-11-23T02:48:20.6261145Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6261300Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6261657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6261796Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6262170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6262297Z getattr(self, test_name)() 2022-11-23T02:48:20.6262651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6262811Z fn() 2022-11-23T02:48:20.6263192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6263320Z test(self, **param_kwargs) 2022-11-23T02:48:20.6263684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6263818Z return func(*args, **kwargs) 2022-11-23T02:48:20.6264125Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6264240Z self.run_subtests( 
2022-11-23T02:48:20.6264584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6264750Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6265120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6265281Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6265666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6265788Z output = model(*input) 2022-11-23T02:48:20.6266169Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6266328Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6266700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6266882Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6267254Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6267374Z _lazy_init(state, module) 2022-11-23T02:48:20.6267735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6267882Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6268227Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6268356Z return func(*args, **kwargs) 2022-11-23T02:48:20.6268730Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6268837Z p_assert( 2022-11-23T02:48:20.6269179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.6269310Z traceback.print_stack() 2022-11-23T02:48:20.6269439Z File "", line 1, in 2022-11-23T02:48:20.6269651Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6269793Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6270003Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6270140Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6270355Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6270464Z self.run() 2022-11-23T02:48:20.6270670Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6270821Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6271176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6271310Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6271684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6271793Z getattr(self, test_name)() 2022-11-23T02:48:20.6272159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6272332Z fn() 2022-11-23T02:48:20.6272714Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6272840Z test(self, **param_kwargs) 2022-11-23T02:48:20.6273214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6273384Z return func(*args, **kwargs) 2022-11-23T02:48:20.6273675Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6273795Z 
self.run_subtests( 2022-11-23T02:48:20.6274160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6274319Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6274697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6274854Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6275432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6275562Z output = model(*input) 2022-11-23T02:48:20.6275987Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6276126Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6276516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6276701Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6277074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6277203Z _lazy_init(state, module) 2022-11-23T02:48:20.6277559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6277706Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6278048Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6278164Z return func(*args, **kwargs) 2022-11-23T02:48:20.6278547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6278654Z p_assert( 2022-11-23T02:48:20.6278998Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6279130Z traceback.print_stack() 2022-11-23T02:48:20.6279266Z File "", line 1, in 2022-11-23T02:48:20.6279479Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6279612Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6279817Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6279969Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6280185Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6280290Z self.run() 2022-11-23T02:48:20.6280500Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6280650Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6280998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6281117Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6281487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6281612Z getattr(self, test_name)() 2022-11-23T02:48:20.6282067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6282170Z fn() 2022-11-23T02:48:20.6282545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6282675Z test(self, **param_kwargs) 2022-11-23T02:48:20.6283044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6283157Z return func(*args, **kwargs) 2022-11-23T02:48:20.6283458Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6283573Z self.run_subtests( 2022-11-23T02:48:20.6283934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6284101Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6284478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6284637Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6285021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6285176Z output = model(*input) 2022-11-23T02:48:20.6285524Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6285670Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6286052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6286235Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6286613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6286743Z _lazy_init(state, module) 2022-11-23T02:48:20.6286872Z File "", line 1, in 2022-11-23T02:48:20.6287212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6287355Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6287707Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6287837Z return func(*args, **kwargs) 2022-11-23T02:48:20.6288050Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6288194Z exitcode = 
_main(fd, parent_sentinel) 2022-11-23T02:48:20.6288584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6288689Z p_assert( 2022-11-23T02:48:20.6288880Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6289040Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6289390Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6289521Z traceback.print_stack() 2022-11-23T02:48:20.6289740Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6289849Z self.run() 2022-11-23T02:48:20.6290056Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6290188Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6290534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6290672Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6291043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6291239Z getattr(self, test_name)() 2022-11-23T02:48:20.6291614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6291719Z fn() 2022-11-23T02:48:20.6292090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6292200Z test(self, **param_kwargs) 2022-11-23T02:48:20.6292566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6292693Z return func(*args, **kwargs) 2022-11-23T02:48:20.6292993Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", 
line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6293109Z self.run_subtests( 2022-11-23T02:48:20.6293468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6293642Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6294016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6294157Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6294593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6294728Z output = model(*input) 2022-11-23T02:48:20.6295065Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6295208Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6295592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6295777Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6296154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6296266Z _lazy_init(state, module) 2022-11-23T02:48:20.6296627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6296779Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6297125Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6297252Z return func(*args, **kwargs) 2022-11-23T02:48:20.6297635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6297742Z 
p_assert( 2022-11-23T02:48:20.6298089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6298202Z traceback.print_stack() 2022-11-23T02:48:20.6298334Z File "", line 1, in 2022-11-23T02:48:20.6298548Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6298692Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6298895Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6299048Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6299269Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6299376Z self.run() 2022-11-23T02:48:20.6299566Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6299714Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6300063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6300200Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6300578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6300783Z getattr(self, test_name)() 2022-11-23T02:48:20.6301154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6301257Z fn() 2022-11-23T02:48:20.6301612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6301748Z test(self, **param_kwargs) 2022-11-23T02:48:20.6302115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6302247Z return func(*args, **kwargs) 2022-11-23T02:48:20.6302548Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6302668Z self.run_subtests( 2022-11-23T02:48:20.6303029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6303202Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6303562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6303720Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6304150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6304285Z output = model(*input) 2022-11-23T02:48:20.6304623Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6304771Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6305159Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6305340Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6305706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6305830Z _lazy_init(state, module) 2022-11-23T02:48:20.6306191Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6306341Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6306692Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6306823Z return func(*args, **kwargs) 2022-11-23T02:48:20.6307211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T02:48:20.6307316Z p_assert( 2022-11-23T02:48:20.6307641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6307771Z traceback.print_stack() 2022-11-23T02:48:20.6307908Z File "", line 1, in 2022-11-23T02:48:20.6308122Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6308268Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6308475Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6308628Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6308831Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6308937Z self.run() 2022-11-23T02:48:20.6309142Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6309291Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6309643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6309781Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6310146Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6310339Z getattr(self, test_name)() 2022-11-23T02:48:20.6310696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6310801Z fn() 2022-11-23T02:48:20.6311179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6311308Z test(self, **param_kwargs) 2022-11-23T02:48:20.6311669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6311797Z return func(*args, **kwargs) 
2022-11-23T02:48:20.6312095Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6312211Z self.run_subtests( 2022-11-23T02:48:20.6312552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6312719Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6313092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6313250Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6313681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6313816Z output = model(*input) 2022-11-23T02:48:20.6314156Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6314302Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6314668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6314853Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6315488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6315617Z _lazy_init(state, module) 2022-11-23T02:48:20.6315984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6316138Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6316485Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6316616Z return func(*args, **kwargs) 2022-11-23T02:48:20.6316982Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6317089Z p_assert( 2022-11-23T02:48:20.6317435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6317572Z traceback.print_stack() 2022-11-23T02:48:20.6317704Z File "", line 1, in 2022-11-23T02:48:20.6317920Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6318066Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6318270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6318413Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6318630Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6318736Z self.run() 2022-11-23T02:48:20.6318944Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6319095Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6319446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6319585Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6320043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6320173Z getattr(self, test_name)() 2022-11-23T02:48:20.6320540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6320639Z fn() 2022-11-23T02:48:20.6320777Z File "", line 1, in 2022-11-23T02:48:20.6321147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6321276Z test(self, **param_kwargs) 2022-11-23T02:48:20.6321642Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6321754Z return func(*args, **kwargs) 2022-11-23T02:48:20.6321971Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6322115Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6322423Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6322541Z self.run_subtests( 2022-11-23T02:48:20.6322749Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6322900Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6323392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6323558Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6323779Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6323887Z self.run() 2022-11-23T02:48:20.6324261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6324420Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6324638Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6324787Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6325172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6325280Z output = model(*input) 2022-11-23T02:48:20.6325632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6325769Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6326099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6326244Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6326612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6326740Z getattr(self, test_name)() 2022-11-23T02:48:20.6327125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6327289Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6327655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6327753Z fn() 2022-11-23T02:48:20.6328131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6328258Z _lazy_init(state, module) 2022-11-23T02:48:20.6328629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6328752Z test(self, **param_kwargs) 2022-11-23T02:48:20.6329112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6329305Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6329676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6329805Z return func(*args, **kwargs) 2022-11-23T02:48:20.6330148Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6330278Z return func(*args, **kwargs) 2022-11-23T02:48:20.6330577Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6330693Z 
self.run_subtests( 2022-11-23T02:48:20.6331083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6331175Z p_assert( 2022-11-23T02:48:20.6331539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6331712Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6332055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6332183Z traceback.print_stack() 2022-11-23T02:48:20.6332556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6332758Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6333156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6333263Z output = model(*input) 2022-11-23T02:48:20.6333594Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6333742Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6334123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6334307Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6334682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6334807Z _lazy_init(state, module) 2022-11-23T02:48:20.6335168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6335301Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6335642Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6335769Z return func(*args, **kwargs) 2022-11-23T02:48:20.6336156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6336260Z p_assert( 2022-11-23T02:48:20.6336601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6336734Z traceback.print_stack() 2022-11-23T02:48:20.6336864Z File "", line 1, in 2022-11-23T02:48:20.6337061Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6337204Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6337414Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6337568Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6337787Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6337892Z self.run() 2022-11-23T02:48:20.6338099Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6338231Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6338583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6338781Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6339156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6339285Z getattr(self, test_name)() 2022-11-23T02:48:20.6339649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6339753Z fn() 2022-11-23T02:48:20.6340126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:48:20.6340236Z test(self, **param_kwargs) 2022-11-23T02:48:20.6340600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6340729Z return func(*args, **kwargs) 2022-11-23T02:48:20.6341029Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6341149Z self.run_subtests( 2022-11-23T02:48:20.6341511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6341677Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6342097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6342249Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6342634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6342759Z output = model(*input) 2022-11-23T02:48:20.6343095Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6343237Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6343621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6343810Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6344192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6344300Z _lazy_init(state, module) 2022-11-23T02:48:20.6344660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6344808Z handle.init_flat_param_attributes() 
2022-11-23T02:48:20.6345153Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6345279Z return func(*args, **kwargs) 2022-11-23T02:48:20.6345672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6345779Z p_assert( 2022-11-23T02:48:20.6346124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6346237Z traceback.print_stack() 2022-11-23T02:48:20.6346368Z File "", line 1, in 2022-11-23T02:48:20.6346583Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6346727Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6346936Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6347093Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6347310Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6347399Z self.run() 2022-11-23T02:48:20.6347603Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6347753Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6348102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6348303Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6348677Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6348804Z getattr(self, test_name)() 2022-11-23T02:48:20.6349176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6349261Z fn() 2022-11-23T02:48:20.6349631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:48:20.6349753Z test(self, **param_kwargs) 2022-11-23T02:48:20.6350117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6350244Z return func(*args, **kwargs) 2022-11-23T02:48:20.6350541Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6350660Z self.run_subtests( 2022-11-23T02:48:20.6351019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6351167Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6351581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6351748Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6352135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6352260Z output = model(*input) 2022-11-23T02:48:20.6352590Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6352735Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6353120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6353286Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6353659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6353789Z _lazy_init(state, module) 2022-11-23T02:48:20.6354149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6354297Z 
handle.init_flat_param_attributes() 2022-11-23T02:48:20.6354642Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6354769Z return func(*args, **kwargs) 2022-11-23T02:48:20.6355366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6355468Z p_assert( 2022-11-23T02:48:20.6355823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6355952Z traceback.print_stack() 2022-11-23T02:48:20.6356087Z File "", line 1, in 2022-11-23T02:48:20.6356300Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6356448Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6356576Z File "", line 1, in 2022-11-23T02:48:20.6356784Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6356920Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6357136Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6357240Z self.run() 2022-11-23T02:48:20.6357451Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6357693Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6357903Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6358048Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6358237Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6358393Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6358750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6358892Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6359111Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6359218Z self.run() 2022-11-23T02:48:20.6359588Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6359715Z getattr(self, test_name)() 2022-11-23T02:48:20.6359905Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6360056Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6360421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6360522Z fn() 2022-11-23T02:48:20.6360930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6361081Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6361460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6361588Z test(self, **param_kwargs) 2022-11-23T02:48:20.6361934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6362062Z getattr(self, test_name)() 2022-11-23T02:48:20.6362425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6362558Z return func(*args, **kwargs) 2022-11-23T02:48:20.6362925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6363020Z fn() 2022-11-23T02:48:20.6363324Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6363440Z self.run_subtests( 2022-11-23T02:48:20.6363797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:48:20.6363925Z test(self, **param_kwargs) 2022-11-23T02:48:20.6364280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6364446Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6364817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6364950Z return func(*args, **kwargs) 2022-11-23T02:48:20.6365318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6365470Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6365753Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6365869Z self.run_subtests( 2022-11-23T02:48:20.6366255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6366377Z output = model(*input) 2022-11-23T02:48:20.6366738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6366907Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6367320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6367470Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6367825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6367986Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6368365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6368545Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6368923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6369047Z output = model(*input) 2022-11-23T02:48:20.6369419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6369554Z _lazy_init(state, module) 2022-11-23T02:48:20.6369874Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6370021Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6370430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6370588Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6370976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6371160Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6371505Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6371633Z return func(*args, **kwargs) 2022-11-23T02:48:20.6371988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6372120Z _lazy_init(state, module) 2022-11-23T02:48:20.6372512Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6372619Z p_assert( 2022-11-23T02:48:20.6372980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6373130Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6373519Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6373654Z traceback.print_stack() 2022-11-23T02:48:20.6373982Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6374111Z return func(*args, **kwargs) 2022-11-23T02:48:20.6374492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6374601Z p_assert( 2022-11-23T02:48:20.6374945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6375071Z traceback.print_stack() 2022-11-23T02:48:20.6375200Z File "", line 1, in 2022-11-23T02:48:20.6375423Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6375552Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6375759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6375910Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6376127Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6376231Z self.run() 2022-11-23T02:48:20.6376435Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6376653Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6376987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6377126Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6377497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6377629Z getattr(self, test_name)() 2022-11-23T02:48:20.6377996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T02:48:20.6378097Z fn() 2022-11-23T02:48:20.6378469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6378595Z test(self, **param_kwargs) 2022-11-23T02:48:20.6378939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6379072Z return func(*args, **kwargs) 2022-11-23T02:48:20.6379371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6379488Z self.run_subtests( 2022-11-23T02:48:20.6379897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6380075Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6380451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6380613Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6380727Z File "", line 1, in 2022-11-23T02:48:20.6381113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6381239Z output = model(*input) 2022-11-23T02:48:20.6381577Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6381720Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6381932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6382078Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6382466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6382632Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T02:48:20.6382840Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6382993Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6383367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6383493Z _lazy_init(state, module) 2022-11-23T02:48:20.6383710Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6383818Z self.run() 2022-11-23T02:48:20.6384175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6384304Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6384515Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6384665Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6385013Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6385141Z return func(*args, **kwargs) 2022-11-23T02:48:20.6385486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6385622Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6386011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6386164Z p_assert( 2022-11-23T02:48:20.6386539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6386666Z getattr(self, test_name)() 2022-11-23T02:48:20.6387012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6387143Z traceback.print_stack() 2022-11-23T02:48:20.6387513Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6387620Z fn() 2022-11-23T02:48:20.6387994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6388105Z test(self, **param_kwargs) 2022-11-23T02:48:20.6388468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6388604Z return func(*args, **kwargs) 2022-11-23T02:48:20.6388905Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6389021Z self.run_subtests( 2022-11-23T02:48:20.6389425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6389604Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6389981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6390120Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6390507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6390635Z output = model(*input) 2022-11-23T02:48:20.6390974Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6391118Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6391498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6391683Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6392057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.6392164Z _lazy_init(state, module) 2022-11-23T02:48:20.6392521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6392672Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6393017Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6393147Z return func(*args, **kwargs) 2022-11-23T02:48:20.6393530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6393634Z p_assert( 2022-11-23T02:48:20.6393978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6394093Z traceback.print_stack() 2022-11-23T02:48:20.6394227Z File "", line 1, in 2022-11-23T02:48:20.6394360Z File "", line 1, in 2022-11-23T02:48:20.6394574Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6394718Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6394923Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6395311Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6395518Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6395755Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6395978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6396083Z self.run() 2022-11-23T02:48:20.6396286Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6396444Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6396650Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6396799Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:48:20.6397001Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6397105Z self.run() 2022-11-23T02:48:20.6397466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6397603Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6397808Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6397960Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6398331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6398442Z getattr(self, test_name)() 2022-11-23T02:48:20.6398858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6399007Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6399380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6399483Z fn() 2022-11-23T02:48:20.6399846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6399970Z getattr(self, test_name)() 2022-11-23T02:48:20.6400343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6400459Z test(self, **param_kwargs) 2022-11-23T02:48:20.6400821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6400922Z fn() 2022-11-23T02:48:20.6401288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6401419Z return func(*args, **kwargs) 2022-11-23T02:48:20.6401787Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6401913Z test(self, **param_kwargs) 2022-11-23T02:48:20.6402214Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6402314Z self.run_subtests( 2022-11-23T02:48:20.6402677Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6402810Z return func(*args, **kwargs) 2022-11-23T02:48:20.6403170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6403337Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6403638Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6403756Z self.run_subtests( 2022-11-23T02:48:20.6404127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6404269Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6404627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6404791Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6405246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6405368Z output = model(*input) 2022-11-23T02:48:20.6405743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6405904Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6406237Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6406364Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6406751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6406876Z output = model(*input) 2022-11-23T02:48:20.6407260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6407446Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6407777Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6407919Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6408339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6408478Z _lazy_init(state, module) 2022-11-23T02:48:20.6408849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6409031Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6409390Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6409536Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6409907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6410038Z _lazy_init(state, module) 2022-11-23T02:48:20.6410382Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6410511Z return func(*args, **kwargs) 2022-11-23T02:48:20.6410854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.6411001Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6411387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6411493Z p_assert( 2022-11-23T02:48:20.6411838Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6411967Z return func(*args, **kwargs) 2022-11-23T02:48:20.6412310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6412445Z traceback.print_stack() 2022-11-23T02:48:20.6412824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6412927Z p_assert( 2022-11-23T02:48:20.6413264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6413391Z traceback.print_stack() 2022-11-23T02:48:20.6413523Z File "", line 1, in 2022-11-23T02:48:20.6413737Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6413879Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6414069Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6414224Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6414439Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6414617Z self.run() 2022-11-23T02:48:20.6414822Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6414972Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6415324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6415465Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6415818Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6415947Z getattr(self, test_name)() 2022-11-23T02:48:20.6416314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6416414Z fn() 2022-11-23T02:48:20.6416782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6416914Z test(self, **param_kwargs) 2022-11-23T02:48:20.6417279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6417408Z return func(*args, **kwargs) 2022-11-23T02:48:20.6417746Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6417873Z self.run_subtests( 2022-11-23T02:48:20.6418238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6418406Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6418776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6418933Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6419313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6419440Z output = model(*input) 2022-11-23T02:48:20.6419759Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6419909Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6420298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T02:48:20.6420480Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6420857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6420982Z _lazy_init(state, module) 2022-11-23T02:48:20.6421341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6421486Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6421818Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6421948Z return func(*args, **kwargs) 2022-11-23T02:48:20.6422331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6422439Z p_assert( 2022-11-23T02:48:20.6422784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6422917Z traceback.print_stack() 2022-11-23T02:48:20.6423049Z File "", line 1, in 2022-11-23T02:48:20.6423263Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6423391Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6423593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6423745Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6424031Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6424141Z self.run() 2022-11-23T02:48:20.6424349Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6424500Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6424841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6424980Z self.run_test(test_name, 
pipe) 2022-11-23T02:48:20.6425352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6425481Z getattr(self, test_name)() 2022-11-23T02:48:20.6425845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6425947Z fn() 2022-11-23T02:48:20.6426317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6426453Z test(self, **param_kwargs) 2022-11-23T02:48:20.6426801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6426932Z return func(*args, **kwargs) 2022-11-23T02:48:20.6427282Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6427412Z self.run_subtests( 2022-11-23T02:48:20.6427776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6427943Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6428312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6428466Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6428839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6428964Z output = model(*input) 2022-11-23T02:48:20.6429300Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6429444Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6429835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T02:48:20.6430018Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6430394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6430519Z _lazy_init(state, module) 2022-11-23T02:48:20.6430860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6431011Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6431351Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6431478Z return func(*args, **kwargs) 2022-11-23T02:48:20.6431868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6431978Z p_assert( 2022-11-23T02:48:20.6432317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6432449Z traceback.print_stack() 2022-11-23T02:48:20.6432561Z File "", line 1, in 2022-11-23T02:48:20.6432692Z File "", line 1, in 2022-11-23T02:48:20.6432905Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6433050Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6433254Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6433486Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6433700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6433844Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6434045Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6434154Z self.run() 2022-11-23T02:48:20.6434359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 
2022-11-23T02:48:20.6434512Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6434719Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6434867Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6435256Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6435354Z self.run() 2022-11-23T02:48:20.6435716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6435858Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6436065Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6436214Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6436663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6436806Z getattr(self, test_name)() 2022-11-23T02:48:20.6437158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6437277Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6437644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6437745Z fn() 2022-11-23T02:48:20.6438116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6438248Z getattr(self, test_name)() 2022-11-23T02:48:20.6438620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6438747Z test(self, **param_kwargs) 2022-11-23T02:48:20.6439115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6439200Z fn() 2022-11-23T02:48:20.6439564Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6439688Z return func(*args, **kwargs) 2022-11-23T02:48:20.6440056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6440180Z test(self, **param_kwargs) 2022-11-23T02:48:20.6440481Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6440604Z self.run_subtests( 2022-11-23T02:48:20.6440971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6441083Z return func(*args, **kwargs) 2022-11-23T02:48:20.6441449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6441616Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6441913Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6442028Z self.run_subtests( 2022-11-23T02:48:20.6442401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6442556Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6442996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6443145Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6443534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6443661Z output = model(*input) 2022-11-23T02:48:20.6444033Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6444186Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6444517Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6444662Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6445044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6445156Z output = model(*input) 2022-11-23T02:48:20.6445539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6445719Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6446101Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6446258Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6446638Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6446764Z _lazy_init(state, module) 2022-11-23T02:48:20.6447144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6447306Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6447663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6447813Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6448182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6448304Z _lazy_init(state, module) 2022-11-23T02:48:20.6448646Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, 
in decorate_context 2022-11-23T02:48:20.6448775Z return func(*args, **kwargs) 2022-11-23T02:48:20.6449129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6449257Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6449648Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6449754Z p_assert( 2022-11-23T02:48:20.6450099Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6450232Z return func(*args, **kwargs) 2022-11-23T02:48:20.6450574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6450706Z traceback.print_stack() 2022-11-23T02:48:20.6451096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6451187Z p_assert( 2022-11-23T02:48:20.6451523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6451651Z traceback.print_stack() 2022-11-23T02:48:20.6451782Z File "", line 1, in 2022-11-23T02:48:20.6451992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6452138Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6452409Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6452546Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6452768Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6452877Z self.run() 2022-11-23T02:48:20.6453086Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6453242Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6453590Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6453728Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6454097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6454207Z getattr(self, test_name)() 2022-11-23T02:48:20.6454570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6454677Z fn() 2022-11-23T02:48:20.6455050Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6455179Z test(self, **param_kwargs) 2022-11-23T02:48:20.6455541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6455717Z return func(*args, **kwargs) 2022-11-23T02:48:20.6456027Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6456125Z self.run_subtests( 2022-11-23T02:48:20.6456487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6456654Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6457025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6457187Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6457571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6457696Z output = model(*input) 2022-11-23T02:48:20.6458033Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6458163Z return 
forward_call(*input, **kwargs) 2022-11-23T02:48:20.6458551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6458733Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6459109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6459232Z _lazy_init(state, module) 2022-11-23T02:48:20.6459596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6459745Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6460088Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6460198Z return func(*args, **kwargs) 2022-11-23T02:48:20.6460589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6460701Z p_assert( 2022-11-23T02:48:20.6461051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6461182Z traceback.print_stack() 2022-11-23T02:48:20.6461319Z File "", line 1, in 2022-11-23T02:48:20.6461537Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6461679Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6461932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6462087Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6462303Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6462412Z self.run() 2022-11-23T02:48:20.6462622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6462776Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:48:20.6463123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6463241Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6463609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6463736Z getattr(self, test_name)() 2022-11-23T02:48:20.6464101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6464209Z fn() 2022-11-23T02:48:20.6464581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6464709Z test(self, **param_kwargs) 2022-11-23T02:48:20.6465117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6465237Z return func(*args, **kwargs) 2022-11-23T02:48:20.6465543Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6465661Z self.run_subtests( 2022-11-23T02:48:20.6466028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6466192Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6466566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6466730Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6467112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6467232Z output = model(*input) 2022-11-23T02:48:20.6467555Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:48:20.6467703Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6468087Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6468267Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6468642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6468771Z _lazy_init(state, module) 2022-11-23T02:48:20.6469127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6469273Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6469602Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6469734Z return func(*args, **kwargs) 2022-11-23T02:48:20.6470125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6470229Z p_assert( 2022-11-23T02:48:20.6470572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6470704Z traceback.print_stack() 2022-11-23T02:48:20.6470836Z File "", line 1, in 2022-11-23T02:48:20.6471033Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6471299Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6471508Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6471667Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6471801Z File "", line 1, in 2022-11-23T02:48:20.6472020Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6472130Z self.run() 2022-11-23T02:48:20.6472336Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in 
run 2022-11-23T02:48:20.6472467Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6472677Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6472817Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6473171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6473309Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6473560Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6473713Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6474068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6474194Z getattr(self, test_name)() 2022-11-23T02:48:20.6474460Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6474579Z self.run() 2022-11-23T02:48:20.6474948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6475211Z fn() 2022-11-23T02:48:20.6475433Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6475584Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6475944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6476076Z test(self, **param_kwargs) 2022-11-23T02:48:20.6476416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6476554Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6476923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6477052Z return func(*args, **kwargs) 2022-11-23T02:48:20.6477422Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6477548Z getattr(self, test_name)() 2022-11-23T02:48:20.6477832Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6477948Z self.run_subtests( 2022-11-23T02:48:20.6478316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6478419Z fn() 2022-11-23T02:48:20.6478779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6478945Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6479320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6479449Z test(self, **param_kwargs) 2022-11-23T02:48:20.6479797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6479954Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6480322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6480446Z return func(*args, **kwargs) 2022-11-23T02:48:20.6480920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6481044Z output = model(*input) 2022-11-23T02:48:20.6481345Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6481462Z self.run_subtests( 2022-11-23T02:48:20.6481783Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:48:20.6481929Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6482289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6482453Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6482833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6483018Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6483386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6483541Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6483959Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6484100Z _lazy_init(state, module) 2022-11-23T02:48:20.6484486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6484610Z output = model(*input) 2022-11-23T02:48:20.6484974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6485121Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6485455Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6485604Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6485933Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6486062Z return func(*args, **kwargs) 2022-11-23T02:48:20.6486448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6486630Z args, kwargs = _root_pre_forward(self, self, *args, 
**kwargs) 2022-11-23T02:48:20.6487017Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6487125Z p_assert( 2022-11-23T02:48:20.6487494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6487618Z _lazy_init(state, module) 2022-11-23T02:48:20.6487948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6488084Z traceback.print_stack() 2022-11-23T02:48:20.6488442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6488589Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6488940Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6489072Z return func(*args, **kwargs) 2022-11-23T02:48:20.6489461Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6489564Z p_assert( 2022-11-23T02:48:20.6489885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6490015Z traceback.print_stack() 2022-11-23T02:48:20.6490126Z dist init r=1, world=2 2022-11-23T02:48:20.6490529Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6490860Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6491177Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.6491492Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6491800Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6492101Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6492410Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6492743Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6493060Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6493367Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6493672Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6493981Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6494096Z dist init r=0, world=2 2022-11-23T02:48:20.6494429Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.6494737Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6495038Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6495342Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6495653Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6495958Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6496249Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6496555Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6496862Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6497166Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6497551Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.6497860Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6497963Z ok (5.112s) 2022-11-23T02:48:20.6498354Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91224 2022-11-23T02:48:20.6498582Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91225 2022-11-23T02:48:20.6498975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6499142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6499534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6499731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6500155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6500344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6500736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6500934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6501184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.6501445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.6501837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.6502243Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6502479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.6502712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.6503746Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6503869Z warnings.warn( 2022-11-23T02:48:20.6504892Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.6505009Z warnings.warn( 2022-11-23T02:48:20.6505143Z File "", line 1, in 2022-11-23T02:48:20.6505361Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6505507Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6505700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6505925Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6506146Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6506248Z self.run() 2022-11-23T02:48:20.6506452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6506655Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6507013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6507132Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6507501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6507627Z getattr(self, test_name)() 2022-11-23T02:48:20.6507995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6508100Z fn() 2022-11-23T02:48:20.6508472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6508596Z test(self, **param_kwargs) 2022-11-23T02:48:20.6508958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6509125Z return func(*args, **kwargs) 2022-11-23T02:48:20.6509442Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6509563Z self.run_subtests( 
2022-11-23T02:48:20.6509926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6510095Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6510468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6510634Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6511020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6511125Z output = model(*input) 2022-11-23T02:48:20.6511460Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6511605Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6511992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6512170Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6512547Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6512672Z _lazy_init(state, module) 2022-11-23T02:48:20.6513026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6513178Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6513508Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6513637Z return func(*args, **kwargs) 2022-11-23T02:48:20.6514022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6514127Z p_assert( 2022-11-23T02:48:20.6514472Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:48:20.6514604Z traceback.print_stack() 2022-11-23T02:48:20.6514733Z File "", line 1, in 2022-11-23T02:48:20.6514929Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6515243Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6515553Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6515711Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6515929Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6516036Z self.run() 2022-11-23T02:48:20.6516238Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6516390Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6516729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6516868Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6517235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6517362Z getattr(self, test_name)() 2022-11-23T02:48:20.6517725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6517831Z fn() 2022-11-23T02:48:20.6518201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6518327Z test(self, **param_kwargs) 2022-11-23T02:48:20.6518674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6518869Z return func(*args, **kwargs) 2022-11-23T02:48:20.6519186Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6519304Z 
self.run_subtests( 2022-11-23T02:48:20.6519669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6519834Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6520207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6520372Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6520740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6520866Z output = model(*input) 2022-11-23T02:48:20.6521203Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6521349Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6521733Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6521915Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6522284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6522410Z _lazy_init(state, module) 2022-11-23T02:48:20.6522754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6522903Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6523250Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6523375Z return func(*args, **kwargs) 2022-11-23T02:48:20.6523765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6523872Z p_assert( 2022-11-23T02:48:20.6524218Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6524352Z traceback.print_stack() 2022-11-23T02:48:20.6524468Z File "", line 1, in 2022-11-23T02:48:20.6524682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6524825Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6525093Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6525248Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6525463Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6525570Z self.run() 2022-11-23T02:48:20.6525762Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6525913Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6526266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6526405Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6526774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6526897Z getattr(self, test_name)() 2022-11-23T02:48:20.6527258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6527366Z fn() 2022-11-23T02:48:20.6527726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6527854Z test(self, **param_kwargs) 2022-11-23T02:48:20.6528269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6528408Z return func(*args, **kwargs) 2022-11-23T02:48:20.6528713Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6528834Z self.run_subtests( 2022-11-23T02:48:20.6529194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6529359Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6529721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6529877Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6530255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6530379Z output = model(*input) 2022-11-23T02:48:20.6530715Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6530864Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6531248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6531427Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6531787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6531915Z _lazy_init(state, module) 2022-11-23T02:48:20.6532275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6532424Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6532767Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6532895Z return func(*args, **kwargs) 2022-11-23T02:48:20.6533356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6533549Z p_assert( 
2022-11-23T02:48:20.6534039Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6534196Z traceback.print_stack() 2022-11-23T02:48:20.6534327Z File "", line 1, in 2022-11-23T02:48:20.6534546Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6534781Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6534990Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6535145Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6535362Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6535451Z self.run() 2022-11-23T02:48:20.6535667Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6535822Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6536362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6536773Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6537318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6537458Z getattr(self, test_name)() 2022-11-23T02:48:20.6537846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6537932Z fn() 2022-11-23T02:48:20.6538303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6538429Z test(self, **param_kwargs) 2022-11-23T02:48:20.6538872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6539022Z return func(*args, **kwargs) 2022-11-23T02:48:20.6539331Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6539447Z self.run_subtests( 2022-11-23T02:48:20.6539815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6539966Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6540353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6540510Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6540901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6541027Z output = model(*input) 2022-11-23T02:48:20.6541367Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6541515Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6541919Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6542085Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6542461Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6542591Z _lazy_init(state, module) 2022-11-23T02:48:20.6542954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6543099Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6543449Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6543574Z return func(*args, **kwargs) 2022-11-23T02:48:20.6544167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T02:48:20.6544259Z p_assert( 2022-11-23T02:48:20.6544744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6544874Z traceback.print_stack() 2022-11-23T02:48:20.6545002Z File "", line 1, in 2022-11-23T02:48:20.6545216Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6545430Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6545636Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6545789Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6545988Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6546096Z self.run() 2022-11-23T02:48:20.6546303Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6546451Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6546802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6546937Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6547308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6547423Z getattr(self, test_name)() 2022-11-23T02:48:20.6547786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6547888Z fn() 2022-11-23T02:48:20.6548266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6548390Z test(self, **param_kwargs) 2022-11-23T02:48:20.6548803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6548942Z return func(*args, **kwargs) 
2022-11-23T02:48:20.6549245Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6549345Z self.run_subtests( 2022-11-23T02:48:20.6549704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6549876Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6550250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6550406Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6550796Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6551076Z output = model(*input) 2022-11-23T02:48:20.6551420Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6551642Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6552031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6552210Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6552581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6552707Z _lazy_init(state, module) 2022-11-23T02:48:20.6553067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6553212Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6553561Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6553688Z return func(*args, **kwargs) 2022-11-23T02:48:20.6554061Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6554168Z p_assert( 2022-11-23T02:48:20.6554511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6554636Z traceback.print_stack() 2022-11-23T02:48:20.6554767Z File "", line 1, in 2022-11-23T02:48:20.6555344Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6555499Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6555689Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6555843Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6556063Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6556165Z self.run() 2022-11-23T02:48:20.6556373Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6556518Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6556870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6557007Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6557360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6557495Z getattr(self, test_name)() 2022-11-23T02:48:20.6557865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6557967Z fn() 2022-11-23T02:48:20.6558343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6558564Z test(self, **param_kwargs) 2022-11-23T02:48:20.6558949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:48:20.6559078Z return func(*args, **kwargs) 2022-11-23T02:48:20.6559363Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6559481Z self.run_subtests( 2022-11-23T02:48:20.6559841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6560008Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6560382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6560539Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6560927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6561051Z output = model(*input) 2022-11-23T02:48:20.6561371Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6561518Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6561901Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6562080Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6562459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6562582Z _lazy_init(state, module) 2022-11-23T02:48:20.6562936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6563082Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6563414Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6563543Z return func(*args, **kwargs) 
2022-11-23T02:48:20.6563933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6564040Z p_assert( 2022-11-23T02:48:20.6564387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6564516Z traceback.print_stack() 2022-11-23T02:48:20.6564745Z File "", line 1, in 2022-11-23T02:48:20.6564960Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6565088Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6565293Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6565445Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6565666Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6565772Z self.run() 2022-11-23T02:48:20.6565978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6566125Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6566458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6566592Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6566958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6567089Z getattr(self, test_name)() 2022-11-23T02:48:20.6567458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6567551Z fn() 2022-11-23T02:48:20.6567978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6568114Z test(self, **param_kwargs) 2022-11-23T02:48:20.6568463Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6568589Z return func(*args, **kwargs) 2022-11-23T02:48:20.6568892Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6569012Z self.run_subtests( 2022-11-23T02:48:20.6569376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6569548Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6569916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6570072Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6570445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6570569Z output = model(*input) 2022-11-23T02:48:20.6570898Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6571041Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6571427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6571607Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6571984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6572107Z _lazy_init(state, module) 2022-11-23T02:48:20.6572451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6572599Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6572946Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:48:20.6573074Z return func(*args, **kwargs) 2022-11-23T02:48:20.6573503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6573610Z p_assert( 2022-11-23T02:48:20.6573956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6574157Z traceback.print_stack() 2022-11-23T02:48:20.6574272Z File "", line 1, in 2022-11-23T02:48:20.6574491Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6574630Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6574838Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6574997Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6575217Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6575324Z self.run() 2022-11-23T02:48:20.6575513Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6575663Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6576007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6576145Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6576520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6576647Z getattr(self, test_name)() 2022-11-23T02:48:20.6577058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6577157Z fn() 2022-11-23T02:48:20.6577564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6577703Z test(self, **param_kwargs) 
2022-11-23T02:48:20.6578073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6578203Z return func(*args, **kwargs) 2022-11-23T02:48:20.6578504Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6578618Z self.run_subtests( 2022-11-23T02:48:20.6578988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6579157Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6579513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6579673Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6580052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6580174Z output = model(*input) 2022-11-23T02:48:20.6580507Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6580653Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6581034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6581220Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6581596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6581705Z _lazy_init(state, module) 2022-11-23T02:48:20.6582070Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6582213Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6582558Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6582686Z return func(*args, **kwargs) 2022-11-23T02:48:20.6583072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6583180Z p_assert( 2022-11-23T02:48:20.6583528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6583706Z traceback.print_stack() 2022-11-23T02:48:20.6583838Z File "", line 1, in 2022-11-23T02:48:20.6584053Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6584196Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6584409Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6584563Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6584778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6584870Z self.run() 2022-11-23T02:48:20.6585068Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6585216Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6585566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6585708Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6586076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6586198Z getattr(self, test_name)() 2022-11-23T02:48:20.6586563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6586647Z fn() 2022-11-23T02:48:20.6587067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:48:20.6587205Z test(self, **param_kwargs) 2022-11-23T02:48:20.6587575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6587703Z return func(*args, **kwargs) 2022-11-23T02:48:20.6588004Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6588126Z self.run_subtests( 2022-11-23T02:48:20.6588484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6588635Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6589006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6589168Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6589552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6589675Z output = model(*input) 2022-11-23T02:48:20.6590010Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6590155Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6590536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6590707Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6591083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6591206Z _lazy_init(state, module) 2022-11-23T02:48:20.6591565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6591712Z handle.init_flat_param_attributes() 
2022-11-23T02:48:20.6592059Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6592187Z return func(*args, **kwargs) 2022-11-23T02:48:20.6592576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6592666Z p_assert( 2022-11-23T02:48:20.6593012Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6593209Z traceback.print_stack() 2022-11-23T02:48:20.6593343Z File "", line 1, in 2022-11-23T02:48:20.6593557Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6593705Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6593915Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6594054Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6594271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6594378Z self.run() 2022-11-23T02:48:20.6594582Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6594733Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6595327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6595479Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6595858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6595968Z getattr(self, test_name)() 2022-11-23T02:48:20.6596422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6596535Z fn() 2022-11-23T02:48:20.6596913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:48:20.6597037Z test(self, **param_kwargs) 2022-11-23T02:48:20.6597401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6597529Z return func(*args, **kwargs) 2022-11-23T02:48:20.6597828Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6597932Z self.run_subtests( 2022-11-23T02:48:20.6598294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6598461Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6598836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6598992Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6599376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6599499Z output = model(*input) 2022-11-23T02:48:20.6599827Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6599957Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6600343Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6600525Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6600900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6601022Z _lazy_init(state, module) 2022-11-23T02:48:20.6601385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6601531Z 
handle.init_flat_param_attributes() 2022-11-23T02:48:20.6601876Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6601987Z return func(*args, **kwargs) 2022-11-23T02:48:20.6602374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6602479Z p_assert( 2022-11-23T02:48:20.6602995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6603126Z traceback.print_stack() 2022-11-23T02:48:20.6603255Z File "", line 1, in 2022-11-23T02:48:20.6603468Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6603620Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6603811Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6603963Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6604177Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6604281Z self.run() 2022-11-23T02:48:20.6604486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6604635Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6604981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6605105Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6605476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6605606Z getattr(self, test_name)() 2022-11-23T02:48:20.6606021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6606132Z fn() 2022-11-23T02:48:20.6606510Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6606632Z test(self, **param_kwargs) 2022-11-23T02:48:20.6606995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6607106Z return func(*args, **kwargs) 2022-11-23T02:48:20.6607405Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6607523Z self.run_subtests( 2022-11-23T02:48:20.6607887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6608053Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6608431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6608594Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6608977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6609103Z output = model(*input) 2022-11-23T02:48:20.6609421Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6609567Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6609955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6610136Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6610516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6610639Z _lazy_init(state, module) 2022-11-23T02:48:20.6611002Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6611152Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6611483Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6611613Z return func(*args, **kwargs) 2022-11-23T02:48:20.6612003Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6612177Z p_assert( 2022-11-23T02:48:20.6612523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6612653Z traceback.print_stack() 2022-11-23T02:48:20.6612784Z File "", line 1, in 2022-11-23T02:48:20.6612982Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6613132Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6613337Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6613487Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6613703Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6613808Z self.run() 2022-11-23T02:48:20.6614010Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6614158Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6614493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6614631Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6615001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6615128Z getattr(self, test_name)() 2022-11-23T02:48:20.6615593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T02:48:20.6615706Z fn() 2022-11-23T02:48:20.6616078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6616206Z test(self, **param_kwargs) 2022-11-23T02:48:20.6616555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6616685Z return func(*args, **kwargs) 2022-11-23T02:48:20.6616983Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6617099Z self.run_subtests( 2022-11-23T02:48:20.6617459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6617626Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6617999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6618153Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6618522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6618644Z output = model(*input) 2022-11-23T02:48:20.6618982Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6619135Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6619524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6619707Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6620086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6620212Z _lazy_init(state, module) 
2022-11-23T02:48:20.6620554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6620700Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6621040Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6621169Z return func(*args, **kwargs) 2022-11-23T02:48:20.6621556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6621729Z p_assert( 2022-11-23T02:48:20.6622077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6622209Z traceback.print_stack() 2022-11-23T02:48:20.6622324Z File "", line 1, in 2022-11-23T02:48:20.6622543Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6622687Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6622894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6623045Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6623263Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6623367Z self.run() 2022-11-23T02:48:20.6623558Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6623713Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6624064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6624202Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6624571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6624746Z getattr(self, test_name)() 2022-11-23T02:48:20.6625124Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6625225Z fn() 2022-11-23T02:48:20.6625581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6625707Z test(self, **param_kwargs) 2022-11-23T02:48:20.6626067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6626195Z return func(*args, **kwargs) 2022-11-23T02:48:20.6626502Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6626619Z self.run_subtests( 2022-11-23T02:48:20.6626979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6627146Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6627500Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6627657Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6628035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6628157Z output = model(*input) 2022-11-23T02:48:20.6628489Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6628635Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6629019Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6629195Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6629555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.6629683Z _lazy_init(state, module) 2022-11-23T02:48:20.6630041Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6630188Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6630532Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6630660Z return func(*args, **kwargs) 2022-11-23T02:48:20.6631047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6631228Z p_assert( 2022-11-23T02:48:20.6631563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6631690Z traceback.print_stack() 2022-11-23T02:48:20.6631822Z File "", line 1, in 2022-11-23T02:48:20.6632040Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6632183Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6632390Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6632542Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6632758Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6632849Z self.run() 2022-11-23T02:48:20.6633049Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6633202Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6633556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6633693Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6634116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6634255Z getattr(self, test_name)() 
2022-11-23T02:48:20.6634608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6634711Z fn() 2022-11-23T02:48:20.6635302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6635441Z test(self, **param_kwargs) 2022-11-23T02:48:20.6635814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6635948Z return func(*args, **kwargs) 2022-11-23T02:48:20.6636249Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6636363Z self.run_subtests( 2022-11-23T02:48:20.6636710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6636880Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6637251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6637407Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6637791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6637916Z output = model(*input) 2022-11-23T02:48:20.6638251Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6638403Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6638786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6638952Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6639336Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6639465Z _lazy_init(state, module) 2022-11-23T02:48:20.6639826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6639970Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6640310Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6640439Z return func(*args, **kwargs) 2022-11-23T02:48:20.6640924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6641013Z p_assert( 2022-11-23T02:48:20.6641354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6641478Z traceback.print_stack() 2022-11-23T02:48:20.6641611Z File "", line 1, in 2022-11-23T02:48:20.6641826Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6641967Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6642170Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6642308Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6642526Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6642632Z self.run() 2022-11-23T02:48:20.6642840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6642991Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6643339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6643471Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6643898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:48:20.6644021Z getattr(self, test_name)() 2022-11-23T02:48:20.6644394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6644493Z fn() 2022-11-23T02:48:20.6644866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6644991Z test(self, **param_kwargs) 2022-11-23T02:48:20.6645353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6645488Z return func(*args, **kwargs) 2022-11-23T02:48:20.6645790Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6645891Z self.run_subtests( 2022-11-23T02:48:20.6646254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6646420Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6646789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6646949Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6647328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6647448Z output = model(*input) 2022-11-23T02:48:20.6647785Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6647914Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6648301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6648482Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T02:48:20.6648857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6648981Z _lazy_init(state, module) 2022-11-23T02:48:20.6649338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6649483Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6649828Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6650001Z return func(*args, **kwargs) 2022-11-23T02:48:20.6650389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6650495Z p_assert( 2022-11-23T02:48:20.6650835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6650970Z traceback.print_stack() 2022-11-23T02:48:20.6651100Z File "", line 1, in 2022-11-23T02:48:20.6651314Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6651463Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6651653Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6651806Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6652018Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6652127Z self.run() 2022-11-23T02:48:20.6652335Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6652481Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6652829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6652948Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6653362Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6653498Z getattr(self, test_name)() 2022-11-23T02:48:20.6653871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6653972Z fn() 2022-11-23T02:48:20.6654343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6654469Z test(self, **param_kwargs) 2022-11-23T02:48:20.6654838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6654952Z return func(*args, **kwargs) 2022-11-23T02:48:20.6655253Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6655372Z self.run_subtests( 2022-11-23T02:48:20.6655729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6655896Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6656267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6656422Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6656805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6656918Z output = model(*input) 2022-11-23T02:48:20.6657252Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6657392Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6657773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T02:48:20.6657955Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6658332Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6658450Z _lazy_init(state, module) 2022-11-23T02:48:20.6658805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6658934Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6659283Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6659475Z return func(*args, **kwargs) 2022-11-23T02:48:20.6659871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6659977Z p_assert( 2022-11-23T02:48:20.6660326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6660456Z traceback.print_stack() 2022-11-23T02:48:20.6660585Z File "", line 1, in 2022-11-23T02:48:20.6660781Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6660923Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6661125Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6661278Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6661494Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6661598Z self.run() 2022-11-23T02:48:20.6661800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6661951Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6662285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6662477Z self.run_test(test_name, 
pipe) 2022-11-23T02:48:20.6662859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6662987Z getattr(self, test_name)() 2022-11-23T02:48:20.6663349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6663449Z fn() 2022-11-23T02:48:20.6663819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6663929Z test(self, **param_kwargs) 2022-11-23T02:48:20.6664297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6664428Z return func(*args, **kwargs) 2022-11-23T02:48:20.6664728Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6664846Z self.run_subtests( 2022-11-23T02:48:20.6665206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6665366Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6665734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6665887Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6666256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6666388Z output = model(*input) 2022-11-23T02:48:20.6666726Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6666869Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6667253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T02:48:20.6667432Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6667805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6667929Z _lazy_init(state, module) 2022-11-23T02:48:20.6668272Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6668421Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6668767Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6668959Z return func(*args, **kwargs) 2022-11-23T02:48:20.6669348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6669455Z p_assert( 2022-11-23T02:48:20.6669801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6669933Z traceback.print_stack() 2022-11-23T02:48:20.6670050Z File "", line 1, in 2022-11-23T02:48:20.6670261Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6670408Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6670614Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6670770Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6670985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6671095Z self.run() 2022-11-23T02:48:20.6671286Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6671436Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6671784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6671970Z 
self.run_test(test_name, pipe) 2022-11-23T02:48:20.6672356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6672482Z getattr(self, test_name)() 2022-11-23T02:48:20.6672843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6672943Z fn() 2022-11-23T02:48:20.6673295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6673464Z test(self, **param_kwargs) 2022-11-23T02:48:20.6673832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6673957Z return func(*args, **kwargs) 2022-11-23T02:48:20.6674259Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6674377Z self.run_subtests( 2022-11-23T02:48:20.6674735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6674904Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6675601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6675761Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6676153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6676283Z output = model(*input) 2022-11-23T02:48:20.6676618Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6676762Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6677154Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6677341Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6677704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6677829Z _lazy_init(state, module) 2022-11-23T02:48:20.6678185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6678330Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6678783Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6678913Z return func(*args, **kwargs) 2022-11-23T02:48:20.6679302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6679409Z p_assert( 2022-11-23T02:48:20.6679741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6679873Z traceback.print_stack() 2022-11-23T02:48:20.6680006Z File "", line 1, in 2022-11-23T02:48:20.6680222Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6680368Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6680576Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6680732Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6680955Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6681046Z self.run() 2022-11-23T02:48:20.6681252Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6681403Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6681823Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6681974Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6682352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6682480Z getattr(self, test_name)() 2022-11-23T02:48:20.6682831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6682931Z fn() 2022-11-23T02:48:20.6683303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6683437Z test(self, **param_kwargs) 2022-11-23T02:48:20.6683806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6683936Z return func(*args, **kwargs) 2022-11-23T02:48:20.6684243Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6684363Z self.run_subtests( 2022-11-23T02:48:20.6684711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6684878Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6685249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6685408Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6685799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6685920Z output = model(*input) 2022-11-23T02:48:20.6686266Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6686394Z return 
forward_call(*input, **kwargs) 2022-11-23T02:48:20.6686785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6686971Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6687350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6687477Z _lazy_init(state, module) 2022-11-23T02:48:20.6687841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6688057Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6688405Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6688518Z return func(*args, **kwargs) 2022-11-23T02:48:20.6688900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6689011Z p_assert( 2022-11-23T02:48:20.6689360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6689488Z traceback.print_stack() 2022-11-23T02:48:20.6689617Z File "", line 1, in 2022-11-23T02:48:20.6689832Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6689978Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6690166Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6690325Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6690540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6690646Z self.run() 2022-11-23T02:48:20.6690852Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6691009Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:48:20.6691420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6691551Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6691922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6692052Z getattr(self, test_name)() 2022-11-23T02:48:20.6692414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6692512Z fn() 2022-11-23T02:48:20.6692885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6693019Z test(self, **param_kwargs) 2022-11-23T02:48:20.6693387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6693499Z return func(*args, **kwargs) 2022-11-23T02:48:20.6693804Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6693924Z self.run_subtests( 2022-11-23T02:48:20.6694291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6694458Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6694836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6694999Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6695384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6695490Z output = model(*input) 2022-11-23T02:48:20.6695821Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:48:20.6695973Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6696360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6696543Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6696917Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6697041Z _lazy_init(state, module) 2022-11-23T02:48:20.6697404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6697618Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6697967Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6698095Z return func(*args, **kwargs) 2022-11-23T02:48:20.6698489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6698596Z p_assert( 2022-11-23T02:48:20.6698946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6699079Z traceback.print_stack() 2022-11-23T02:48:20.6699208Z File "", line 1, in 2022-11-23T02:48:20.6699407Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6699552Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6699756Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6699916Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6700132Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6700238Z self.run() 2022-11-23T02:48:20.6700446Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6700629Z 
self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6700991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6701127Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6701501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6701634Z getattr(self, test_name)() 2022-11-23T02:48:20.6702005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6702112Z fn() 2022-11-23T02:48:20.6702485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6702597Z test(self, **param_kwargs) 2022-11-23T02:48:20.6702962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6703096Z return func(*args, **kwargs) 2022-11-23T02:48:20.6703403Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6703521Z self.run_subtests( 2022-11-23T02:48:20.6703881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6704050Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6704424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6704571Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6704955Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6705081Z output = model(*input) 2022-11-23T02:48:20.6705420Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T02:48:20.6705567Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6705955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6706134Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6706510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6706636Z _lazy_init(state, module) 2022-11-23T02:48:20.6706979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6707189Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6707541Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6707669Z return func(*args, **kwargs) 2022-11-23T02:48:20.6708064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6708169Z p_assert( 2022-11-23T02:48:20.6708522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6708635Z traceback.print_stack() 2022-11-23T02:48:20.6708768Z File "", line 1, in 2022-11-23T02:48:20.6708983Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6709130Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6709343Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6709498Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6709716Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6709822Z self.run() 2022-11-23T02:48:20.6710013Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T02:48:20.6710212Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6710576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6710714Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6711086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6711214Z getattr(self, test_name)() 2022-11-23T02:48:20.6711585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6711695Z fn() 2022-11-23T02:48:20.6712053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6712185Z test(self, **param_kwargs) 2022-11-23T02:48:20.6712552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6712686Z return func(*args, **kwargs) 2022-11-23T02:48:20.6712989Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6713106Z self.run_subtests( 2022-11-23T02:48:20.6713470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6713638Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6713995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6714158Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6714549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6714674Z output = model(*input) 2022-11-23T02:48:20.6715013Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6715478Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6715875Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6716058Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6716417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6716544Z _lazy_init(state, module) 2022-11-23T02:48:20.6717001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6717148Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6717497Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6717626Z return func(*args, **kwargs) 2022-11-23T02:48:20.6718017Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6718127Z p_assert( 2022-11-23T02:48:20.6718459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6718594Z traceback.print_stack() 2022-11-23T02:48:20.6718728Z File "", line 1, in 2022-11-23T02:48:20.6718941Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6719087Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6719298Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6719454Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6719656Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6719762Z self.run() 2022-11-23T02:48:20.6720034Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6720199Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6720550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6720691Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6721062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6721187Z getattr(self, test_name)() 2022-11-23T02:48:20.6721539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6721648Z fn() 2022-11-23T02:48:20.6722023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6722151Z test(self, **param_kwargs) 2022-11-23T02:48:20.6722518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6722651Z return func(*args, **kwargs) 2022-11-23T02:48:20.6722955Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6723072Z self.run_subtests( 2022-11-23T02:48:20.6723416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6723585Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6723965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6724125Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6724509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6724633Z output = model(*input) 
2022-11-23T02:48:20.6724972Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6725120Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6725495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6725677Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6726049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6726251Z _lazy_init(state, module) 2022-11-23T02:48:20.6726617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6726765Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6727107Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6727242Z return func(*args, **kwargs) 2022-11-23T02:48:20.6727614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6727724Z p_assert( 2022-11-23T02:48:20.6728069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6728200Z traceback.print_stack() 2022-11-23T02:48:20.6728330Z File "", line 1, in 2022-11-23T02:48:20.6728542Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6728693Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6728896Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6729035Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6729249Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6729358Z self.run() 
2022-11-23T02:48:20.6729622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6729780Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6730135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6730273Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6730629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6730758Z getattr(self, test_name)() 2022-11-23T02:48:20.6731136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6731233Z fn() 2022-11-23T02:48:20.6731605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6731730Z test(self, **param_kwargs) 2022-11-23T02:48:20.6732105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6732236Z return func(*args, **kwargs) 2022-11-23T02:48:20.6732520Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:48:20.6732632Z self.run_subtests( 2022-11-23T02:48:20.6732992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6733165Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6733541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6733698Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6734081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:48:20.6734209Z output = model(*input) 2022-11-23T02:48:20.6734546Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6734673Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6735060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6735242Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6735618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6735806Z _lazy_init(state, module) 2022-11-23T02:48:20.6736170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6736317Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6736665Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6736780Z return func(*args, **kwargs) 2022-11-23T02:48:20.6737170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6737278Z p_assert( 2022-11-23T02:48:20.6737627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6737764Z traceback.print_stack() 2022-11-23T02:48:20.6737878Z dist init r=1, world=2 2022-11-23T02:48:20.6738214Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6738546Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.6738897Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6739220Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6739526Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6739836Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6740150Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6740461Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6740768Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6741101Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6741422Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.6741741Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.6741856Z dist init r=0, world=2 2022-11-23T02:48:20.6742151Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6742466Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6742778Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6743089Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6743454Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6743758Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6744068Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6744371Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6744680Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6744986Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.6745292Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6745644Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.6745745Z ok (5.112s) 2022-11-23T02:48:20.6746095Z test_transformer_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91307 2022-11-23T02:48:20.6746320Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91308 2022-11-23T02:48:20.6746714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6746892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6747289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6747482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6747866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6748047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6748414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6748607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6748858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.6749106Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.6749517Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6749929Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6750171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.6750409Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.6750651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6750873Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6751903Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6752083Z warnings.warn( 2022-11-23T02:48:20.6753114Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6753232Z warnings.warn( 2022-11-23T02:48:20.6753466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6753700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6753939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6754175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6754411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6754672Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6754920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6755415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6755657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6755888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6756128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6756357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6756586Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6756820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6757034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6757263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6757491Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6757719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6757946Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6758182Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6758418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6758646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6758860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6759089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6759309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6759533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6759762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6759989Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6760382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6760609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6760838Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6761052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6761278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6761504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6761727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6761956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6762184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6762417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6762641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6762852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6763137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6763381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6763609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6763840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6764065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6764297Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6764410Z dist init r=0, world=2 2022-11-23T02:48:20.6764507Z dist init r=1, world=2 2022-11-23T02:48:20.6764612Z ok (8.017s) 2022-11-23T02:48:20.6764961Z test_transformer_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91390 2022-11-23T02:48:20.6765189Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91391 2022-11-23T02:48:20.6765588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6765766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6766163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6766363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6766734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6766917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6767309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6767512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6767767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.6768017Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.6768425Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6768832Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.6769135Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.6769351Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.6769590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6769831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6770863Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6770987Z warnings.warn( 2022-11-23T02:48:20.6772056Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6772180Z warnings.warn( 2022-11-23T02:48:20.6772416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6772651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6772885Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6773123Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6773343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6773619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6773853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6774089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6774319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6774548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6774777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6775009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6775220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6775456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6775685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6775917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6776150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6776380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6776611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6776841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6777071Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6777282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6777587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6777817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6778044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6778280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6778510Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6778733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6778957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6779167Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6779399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6779627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6779857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6780140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6780380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6780607Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6780834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6781044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6781267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6781500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6781728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6781957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6782190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6782421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6782646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6782874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6782972Z dist init r=1, world=2 2022-11-23T02:48:20.6783082Z dist init r=0, world=2 2022-11-23T02:48:20.6783186Z ok (8.418s) 2022-11-23T02:48:20.6783543Z test_transformer_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91473 2022-11-23T02:48:20.6783773Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91474 2022-11-23T02:48:20.6784167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6784357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6784748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6784930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6785310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6785489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6785876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6786132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6786384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.6786639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.6787050Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6787457Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.6787673Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.6787900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.6788143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6788384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6789468Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6789593Z warnings.warn( 2022-11-23T02:48:20.6790616Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6790736Z warnings.warn( 2022-11-23T02:48:20.6790972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6791215Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6791450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6791670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6791899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6792129Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6792365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6792600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6792830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6793057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6793292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6793506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6793733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6793961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6794191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6794486Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6794716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6794945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6795439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6795659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6795887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6796114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6796340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6796566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6796801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6797032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6797259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6797559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6797785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6798012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6798237Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6798461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6798688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6798926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6799153Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6799379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6799596Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6799829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6800058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6800286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6800515Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6800745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6800977Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6801206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6801419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6801652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6801768Z dist init r=0, world=2 2022-11-23T02:48:20.6801882Z dist init r=1, world=2 2022-11-23T02:48:20.6801979Z ok (8.217s) 2022-11-23T02:48:20.6802324Z test_transformer_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91556 2022-11-23T02:48:20.6802551Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91557 2022-11-23T02:48:20.6803021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6803190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6803579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6803784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6804162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.6804344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.6804735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.6804931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.6805182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.6805441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.6805835Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.6806296Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.6806586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.6806815Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.6807053Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6807292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6808329Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.6808458Z warnings.warn( 2022-11-23T02:48:20.6809480Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.6809598Z warnings.warn( 2022-11-23T02:48:20.6809733Z File "", line 1, in 2022-11-23T02:48:20.6809935Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6810078Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6810287Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6810441Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6810661Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6810769Z self.run() 2022-11-23T02:48:20.6810975Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6811107Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6811465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6811599Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6811974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6812169Z getattr(self, test_name)() 2022-11-23T02:48:20.6812546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6812650Z fn() 2022-11-23T02:48:20.6813032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6813144Z test(self, **param_kwargs) 2022-11-23T02:48:20.6813512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6813644Z return func(*args, **kwargs) 2022-11-23T02:48:20.6813889Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6814009Z self.run_subtests( 2022-11-23T02:48:20.6814373Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6814548Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6814925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6815069Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6815502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6815637Z output = model(*input) 2022-11-23T02:48:20.6815974Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6816118Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6816505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6816688Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6817074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6817185Z _lazy_init(state, module) 2022-11-23T02:48:20.6817549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6817700Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6818055Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6818184Z return func(*args, **kwargs) 2022-11-23T02:48:20.6818574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6818680Z p_assert( 2022-11-23T02:48:20.6819026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.6819140Z traceback.print_stack() 2022-11-23T02:48:20.6819277Z File "", line 1, in 2022-11-23T02:48:20.6819493Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6819640Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6819850Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6820004Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6820226Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6820334Z self.run() 2022-11-23T02:48:20.6820526Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6820677Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6821033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6821169Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6821543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6821749Z getattr(self, test_name)() 2022-11-23T02:48:20.6822128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6822230Z fn() 2022-11-23T02:48:20.6822591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6822717Z test(self, **param_kwargs) 2022-11-23T02:48:20.6823081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6823212Z return func(*args, **kwargs) 2022-11-23T02:48:20.6823460Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6823579Z self.run_subtests( 2022-11-23T02:48:20.6823948Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6824105Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6824480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6824635Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6825072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6825204Z output = model(*input) 2022-11-23T02:48:20.6825545Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6825685Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6826075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6826258Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6826625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6826750Z _lazy_init(state, module) 2022-11-23T02:48:20.6827116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6827270Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6827619Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6827752Z return func(*args, **kwargs) 2022-11-23T02:48:20.6828140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6828248Z p_assert( 2022-11-23T02:48:20.6828578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.6828709Z traceback.print_stack() 2022-11-23T02:48:20.6828956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6829194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6829325Z File "", line 1, in 2022-11-23T02:48:20.6829540Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6829692Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6829882Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6830034Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6830251Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6830360Z self.run() 2022-11-23T02:48:20.6830566Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6830715Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6831137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6831274Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6831630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6831755Z getattr(self, test_name)() 2022-11-23T02:48:20.6832129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6832233Z fn() 2022-11-23T02:48:20.6832612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6832736Z test(self, **param_kwargs) 2022-11-23T02:48:20.6833106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6833236Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.6833471Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6833588Z self.run_subtests( 2022-11-23T02:48:20.6833952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6834121Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6834554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6834725Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6835367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6835498Z output = model(*input) 2022-11-23T02:48:20.6835828Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6835971Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6836361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6836545Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6836919Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6837050Z _lazy_init(state, module) 2022-11-23T02:48:20.6837412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6837560Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6837894Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6838028Z return func(*args, **kwargs) 2022-11-23T02:48:20.6838416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6838527Z p_assert( 2022-11-23T02:48:20.6838873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6839004Z traceback.print_stack() 2022-11-23T02:48:20.6839137Z File "", line 1, in 2022-11-23T02:48:20.6839353Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6839485Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6839692Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6839846Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6840063Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6840169Z self.run() 2022-11-23T02:48:20.6840375Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6840528Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6840960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6841095Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6841467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6841596Z getattr(self, test_name)() 2022-11-23T02:48:20.6841974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6842079Z fn() 2022-11-23T02:48:20.6842454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6842583Z test(self, **param_kwargs) 2022-11-23T02:48:20.6842932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6843064Z return func(*args, **kwargs) 
2022-11-23T02:48:20.6843316Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6843437Z self.run_subtests( 2022-11-23T02:48:20.6843803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6843970Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6844414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6844585Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6844961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6845084Z output = model(*input) 2022-11-23T02:48:20.6845424Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6845571Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6845971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6846156Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6846533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6846665Z _lazy_init(state, module) 2022-11-23T02:48:20.6847009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6847160Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6847509Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6847641Z return func(*args, **kwargs) 2022-11-23T02:48:20.6848033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.6848150Z p_assert( 2022-11-23T02:48:20.6848500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6848634Z traceback.print_stack() 2022-11-23T02:48:20.6848861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6849107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6849245Z File "", line 1, in 2022-11-23T02:48:20.6849461Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6849607Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6849813Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6849972Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6850190Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6850343Z self.run() 2022-11-23T02:48:20.6850551Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6850704Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6851061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6851204Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6851578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6851709Z getattr(self, test_name)() 2022-11-23T02:48:20.6852064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6852169Z fn() 2022-11-23T02:48:20.6852544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6852675Z test(self, **param_kwargs) 
2022-11-23T02:48:20.6853046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6853175Z return func(*args, **kwargs) 2022-11-23T02:48:20.6853420Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6853538Z self.run_subtests( 2022-11-23T02:48:20.6853938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6854120Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6854491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6854649Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6855037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6855170Z output = model(*input) 2022-11-23T02:48:20.6855509Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6855657Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6856028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6856216Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6856596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6856724Z _lazy_init(state, module) 2022-11-23T02:48:20.6857091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6857240Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6857589Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:48:20.6857724Z return func(*args, **kwargs) 2022-11-23T02:48:20.6858098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6858209Z p_assert( 2022-11-23T02:48:20.6858559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6858694Z traceback.print_stack() 2022-11-23T02:48:20.6858826Z File "", line 1, in 2022-11-23T02:48:20.6859041Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6859187Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6859395Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6859532Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6859752Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6859924Z self.run() 2022-11-23T02:48:20.6860134Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6860283Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6860637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6860780Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6861152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6861262Z getattr(self, test_name)() 2022-11-23T02:48:20.6861634Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6861738Z fn() 2022-11-23T02:48:20.6862118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6862251Z test(self, **param_kwargs) 2022-11-23T02:48:20.6862621Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6862757Z return func(*args, **kwargs) 2022-11-23T02:48:20.6862986Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6863103Z self.run_subtests( 2022-11-23T02:48:20.6863518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6863695Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6864070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6864228Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6864615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6864748Z output = model(*input) 2022-11-23T02:48:20.6865067Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6865215Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6865607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6865794Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6866171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6874407Z _lazy_init(state, module) 2022-11-23T02:48:20.6874863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6875319Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6875723Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.6875868Z return func(*args, **kwargs) 2022-11-23T02:48:20.6876269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6876378Z p_assert( 2022-11-23T02:48:20.6876731Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6876866Z traceback.print_stack() 2022-11-23T02:48:20.6877094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6877333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6877468Z File "", line 1, in 2022-11-23T02:48:20.6877682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6877828Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6878196Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6878353Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6878553Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6878663Z self.run() 2022-11-23T02:48:20.6878871Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6879025Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6879390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6879531Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6879908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6880037Z getattr(self, test_name)() 2022-11-23T02:48:20.6880390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6880501Z fn() 
2022-11-23T02:48:20.6880877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6881004Z test(self, **param_kwargs) 2022-11-23T02:48:20.6881374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6881581Z return func(*args, **kwargs) 2022-11-23T02:48:20.6881843Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6881963Z self.run_subtests( 2022-11-23T02:48:20.6882313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6882482Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6882855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6883022Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6883413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6883537Z output = model(*input) 2022-11-23T02:48:20.6883876Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6884021Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6884393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6884576Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6884950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6885076Z _lazy_init(state, module) 2022-11-23T02:48:20.6885434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.6885590Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6885940Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6886070Z return func(*args, **kwargs) 2022-11-23T02:48:20.6886448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6886555Z p_assert( 2022-11-23T02:48:20.6886902Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6887033Z traceback.print_stack() 2022-11-23T02:48:20.6887163Z File "", line 1, in 2022-11-23T02:48:20.6887379Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6887525Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6887808Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6887947Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6888162Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6888269Z self.run() 2022-11-23T02:48:20.6888475Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6888629Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6888979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6889117Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6889470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6889600Z getattr(self, test_name)() 2022-11-23T02:48:20.6889970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6890076Z fn() 2022-11-23T02:48:20.6890450Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6890578Z test(self, **param_kwargs) 2022-11-23T02:48:20.6890942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6891124Z return func(*args, **kwargs) 2022-11-23T02:48:20.6891361Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6891479Z self.run_subtests( 2022-11-23T02:48:20.6891846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6892014Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6892389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6892552Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6892939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6893063Z output = model(*input) 2022-11-23T02:48:20.6893383Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6893528Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6893913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6894098Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6894475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6894601Z _lazy_init(state, module) 2022-11-23T02:48:20.6894958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.6895112Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6895444Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6895573Z return func(*args, **kwargs) 2022-11-23T02:48:20.6895966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6896072Z p_assert( 2022-11-23T02:48:20.6896421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6896553Z traceback.print_stack() 2022-11-23T02:48:20.6896794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6897031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6897230Z File "", line 1, in 2022-11-23T02:48:20.6897446Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6897588Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6897795Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6897952Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6898175Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6898280Z self.run() 2022-11-23T02:48:20.6898483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6898617Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6898970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6899096Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6899464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6899597Z getattr(self, test_name)() 
2022-11-23T02:48:20.6899965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6900061Z fn() 2022-11-23T02:48:20.6900477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6900605Z test(self, **param_kwargs) 2022-11-23T02:48:20.6900964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6901081Z return func(*args, **kwargs) 2022-11-23T02:48:20.6901315Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6901428Z self.run_subtests( 2022-11-23T02:48:20.6901793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6901966Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6902324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6902482Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6902873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6902996Z output = model(*input) 2022-11-23T02:48:20.6903334Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6903480Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6903862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6904045Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6904432Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.6904540Z _lazy_init(state, module) 2022-11-23T02:48:20.6904902Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6905050Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6905401Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6905531Z return func(*args, **kwargs) 2022-11-23T02:48:20.6905921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6906028Z p_assert( 2022-11-23T02:48:20.6906353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6906483Z traceback.print_stack() 2022-11-23T02:48:20.6906680Z File "", line 1, in 2022-11-23T02:48:20.6906894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6907041Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6907248Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6907403Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6907625Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6907714Z self.run() 2022-11-23T02:48:20.6907921Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6908069Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6908424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6908561Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6908933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6909065Z getattr(self, test_name)() 
2022-11-23T02:48:20.6909432Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6909516Z fn() 2022-11-23T02:48:20.6910000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6910140Z test(self, **param_kwargs) 2022-11-23T02:48:20.6910513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6910644Z return func(*args, **kwargs) 2022-11-23T02:48:20.6910888Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6911006Z self.run_subtests( 2022-11-23T02:48:20.6911351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6911524Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6911900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6912060Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6912450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6912576Z output = model(*input) 2022-11-23T02:48:20.6912910Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6913058Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6913443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6913609Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6913988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.6914115Z _lazy_init(state, module) 2022-11-23T02:48:20.6914475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6914622Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6914975Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6915417Z return func(*args, **kwargs) 2022-11-23T02:48:20.6915829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6915920Z p_assert( 2022-11-23T02:48:20.6916266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6916396Z traceback.print_stack() 2022-11-23T02:48:20.6916736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6916976Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.6917109Z File "", line 1, in 2022-11-23T02:48:20.6917323Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6917455Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6917663Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6917816Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6918030Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6918138Z self.run() 2022-11-23T02:48:20.6918342Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6918492Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6918854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6918974Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6919345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6919472Z getattr(self, test_name)() 2022-11-23T02:48:20.6919911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6920025Z fn() 2022-11-23T02:48:20.6920406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6920534Z test(self, **param_kwargs) 2022-11-23T02:48:20.6920900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6921011Z return func(*args, **kwargs) 2022-11-23T02:48:20.6921254Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6921376Z self.run_subtests( 2022-11-23T02:48:20.6921741Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6921907Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6922287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6922445Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6922831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6922938Z output = model(*input) 2022-11-23T02:48:20.6923267Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6923411Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6923804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6923986Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6924362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6924495Z _lazy_init(state, module) 2022-11-23T02:48:20.6924855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6924986Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6925331Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6925459Z return func(*args, **kwargs) 2022-11-23T02:48:20.6925849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6926022Z p_assert( 2022-11-23T02:48:20.6926372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.6926504Z traceback.print_stack() 2022-11-23T02:48:20.6926636Z File "", line 1, in 2022-11-23T02:48:20.6926832Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6926982Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6927190Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6927343Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6927559Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6927666Z self.run() 2022-11-23T02:48:20.6927871Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6928003Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6928360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6928498Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6928872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6928998Z getattr(self, test_name)() 2022-11-23T02:48:20.6929423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6929538Z fn() 2022-11-23T02:48:20.6929918Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6930029Z test(self, **param_kwargs) 2022-11-23T02:48:20.6930392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6930521Z return func(*args, **kwargs) 2022-11-23T02:48:20.6930770Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6930887Z self.run_subtests( 2022-11-23T02:48:20.6931247Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6931414Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6931789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6931932Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6932315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6932440Z output = model(*input) 2022-11-23T02:48:20.6932772Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6932920Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6933311Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6933495Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6933871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6933983Z _lazy_init(state, module) 2022-11-23T02:48:20.6934344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6934494Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6934843Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6934972Z return func(*args, **kwargs) 2022-11-23T02:48:20.6935358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6935535Z p_assert( 2022-11-23T02:48:20.6935885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.6935998Z traceback.print_stack() 2022-11-23T02:48:20.6936241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6936489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6936623Z File "", line 1, in 2022-11-23T02:48:20.6936838Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6936983Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6937187Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6937340Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6937540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6937650Z self.run() 2022-11-23T02:48:20.6937853Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6938001Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6938354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6938492Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6938916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6939035Z getattr(self, test_name)() 2022-11-23T02:48:20.6939408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6939511Z fn() 2022-11-23T02:48:20.6939882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6940010Z test(self, **param_kwargs) 2022-11-23T02:48:20.6940381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6940513Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.6940752Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6940852Z self.run_subtests( 2022-11-23T02:48:20.6941218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6941383Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6941757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6941915Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6942306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6942435Z output = model(*input) 2022-11-23T02:48:20.6942769Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6942897Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6943281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6943467Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6943844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6943969Z _lazy_init(state, module) 2022-11-23T02:48:20.6944330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6944476Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6944817Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6944997Z return func(*args, **kwargs) 2022-11-23T02:48:20.6945396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6945504Z p_assert( 2022-11-23T02:48:20.6945850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6945985Z traceback.print_stack() 2022-11-23T02:48:20.6946118Z File "", line 1, in 2022-11-23T02:48:20.6946332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6946480Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6946669Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6946821Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6947038Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6947155Z self.run() 2022-11-23T02:48:20.6947360Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6947508Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6947853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6948047Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6948415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6948542Z getattr(self, test_name)() 2022-11-23T02:48:20.6948906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6949004Z fn() 2022-11-23T02:48:20.6949379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6949506Z test(self, **param_kwargs) 2022-11-23T02:48:20.6949877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6950008Z return func(*args, **kwargs) 
2022-11-23T02:48:20.6950236Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6950352Z self.run_subtests( 2022-11-23T02:48:20.6950720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6950890Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6951261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6951420Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6951806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6951935Z output = model(*input) 2022-11-23T02:48:20.6952254Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6952398Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6952780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6952966Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6953347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6953475Z _lazy_init(state, module) 2022-11-23T02:48:20.6953833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6953983Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6954311Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6954510Z return func(*args, **kwargs) 2022-11-23T02:48:20.6954911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.6955285Z p_assert( 2022-11-23T02:48:20.6955657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6955795Z traceback.print_stack() 2022-11-23T02:48:20.6956040Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6956282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6956398Z File "", line 1, in 2022-11-23T02:48:20.6956612Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6956756Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6956959Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6957116Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6957332Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6957439Z self.run() 2022-11-23T02:48:20.6957628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6957857Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6958226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6958367Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6958738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6958865Z getattr(self, test_name)() 2022-11-23T02:48:20.6959233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6959343Z fn() 2022-11-23T02:48:20.6959700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6959828Z test(self, **param_kwargs) 
2022-11-23T02:48:20.6960196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6960330Z return func(*args, **kwargs) 2022-11-23T02:48:20.6960577Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6960695Z self.run_subtests( 2022-11-23T02:48:20.6961055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6961224Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6961576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6961741Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6962128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6962255Z output = model(*input) 2022-11-23T02:48:20.6962587Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6962736Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6963119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6963299Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6963659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6963787Z _lazy_init(state, module) 2022-11-23T02:48:20.6964148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6964393Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6964744Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:48:20.6964875Z return func(*args, **kwargs) 2022-11-23T02:48:20.6965271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6965379Z p_assert( 2022-11-23T02:48:20.6965709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6965838Z traceback.print_stack() 2022-11-23T02:48:20.6965971Z File "", line 1, in 2022-11-23T02:48:20.6966184Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6966329Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6966535Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6966695Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6966894Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6967003Z self.run() 2022-11-23T02:48:20.6967207Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6967408Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6967767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6967903Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6968276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6968405Z getattr(self, test_name)() 2022-11-23T02:48:20.6968755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6968865Z fn() 2022-11-23T02:48:20.6969239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6969369Z test(self, **param_kwargs) 2022-11-23T02:48:20.6969735Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6969870Z return func(*args, **kwargs) 2022-11-23T02:48:20.6970114Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6970229Z self.run_subtests( 2022-11-23T02:48:20.6970575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6970742Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6971110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6971273Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6971660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6971785Z output = model(*input) 2022-11-23T02:48:20.6972126Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6972272Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6972640Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6972821Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6973200Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6973325Z _lazy_init(state, module) 2022-11-23T02:48:20.6973737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.6973957Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6974313Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.6974444Z return func(*args, **kwargs) 2022-11-23T02:48:20.6974821Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6974931Z p_assert( 2022-11-23T02:48:20.6975277Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6975410Z traceback.print_stack() 2022-11-23T02:48:20.6975650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6975890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6976022Z File "", line 1, in 2022-11-23T02:48:20.6976240Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6976367Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6976572Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6976725Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6976993Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6977112Z self.run() 2022-11-23T02:48:20.6977323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6977473Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6977807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6977945Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6978317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6978452Z getattr(self, test_name)() 2022-11-23T02:48:20.6978819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6978924Z fn() 
2022-11-23T02:48:20.6979301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6979429Z test(self, **param_kwargs) 2022-11-23T02:48:20.6979776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6979904Z return func(*args, **kwargs) 2022-11-23T02:48:20.6980149Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6980263Z self.run_subtests( 2022-11-23T02:48:20.6980624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6980796Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6981168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6981324Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6981696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6981821Z output = model(*input) 2022-11-23T02:48:20.6982158Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6982305Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6982691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6982871Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6983317Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6983442Z _lazy_init(state, module) 2022-11-23T02:48:20.6983785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.6983933Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6984282Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6984412Z return func(*args, **kwargs) 2022-11-23T02:48:20.6984799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6984905Z p_assert( 2022-11-23T02:48:20.6985247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6985379Z traceback.print_stack() 2022-11-23T02:48:20.6985492Z File "", line 1, in 2022-11-23T02:48:20.6985708Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6985856Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6986062Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6986213Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6986477Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6986597Z self.run() 2022-11-23T02:48:20.6986807Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6986938Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6987291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6987425Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6987799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6987933Z getattr(self, test_name)() 2022-11-23T02:48:20.6988298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6988402Z fn() 2022-11-23T02:48:20.6988762Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6988890Z test(self, **param_kwargs) 2022-11-23T02:48:20.6989256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6989381Z return func(*args, **kwargs) 2022-11-23T02:48:20.6989625Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6989740Z self.run_subtests( 2022-11-23T02:48:20.6990102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.6990276Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.6990630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.6990788Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.6991175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.6991299Z output = model(*input) 2022-11-23T02:48:20.6991628Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.6991773Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.6992153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.6992335Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.6992784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.6992892Z _lazy_init(state, module) 2022-11-23T02:48:20.6993254Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.6993401Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.6993749Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.6993878Z return func(*args, **kwargs) 2022-11-23T02:48:20.6994267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.6994373Z p_assert( 2022-11-23T02:48:20.6994721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.6994834Z traceback.print_stack() 2022-11-23T02:48:20.6995324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6995579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.6995715Z File "", line 1, in 2022-11-23T02:48:20.6995928Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.6996158Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.6996378Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.6996515Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.6996737Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.6996844Z self.run() 2022-11-23T02:48:20.6997049Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.6997197Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.6997552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.6997697Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.6998069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.6998179Z getattr(self, test_name)() 
2022-11-23T02:48:20.6998549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.6998652Z fn() 2022-11-23T02:48:20.6999023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.6999150Z test(self, **param_kwargs) 2022-11-23T02:48:20.6999509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.6999638Z return func(*args, **kwargs) 2022-11-23T02:48:20.6999880Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.6999985Z self.run_subtests( 2022-11-23T02:48:20.7000349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7000516Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7000895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7001053Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7001436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7001561Z output = model(*input) 2022-11-23T02:48:20.7001894Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7002024Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7002500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7002686Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7003060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.7003193Z _lazy_init(state, module) 2022-11-23T02:48:20.7003554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7003700Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7004050Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7004161Z return func(*args, **kwargs) 2022-11-23T02:48:20.7004549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7004658Z p_assert( 2022-11-23T02:48:20.7005004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7005133Z traceback.print_stack() 2022-11-23T02:48:20.7005260Z File "", line 1, in 2022-11-23T02:48:20.7005477Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7005656Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7005874Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7006028Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7006245Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7006351Z self.run() 2022-11-23T02:48:20.7006557Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7006707Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7007064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7007185Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7007557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7007687Z getattr(self, test_name)() 
2022-11-23T02:48:20.7008056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7008159Z fn() 2022-11-23T02:48:20.7008532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7008661Z test(self, **param_kwargs) 2022-11-23T02:48:20.7009023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7009134Z return func(*args, **kwargs) 2022-11-23T02:48:20.7009372Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7009492Z self.run_subtests( 2022-11-23T02:48:20.7009853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7010024Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7010401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7010561Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7010944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7011048Z output = model(*input) 2022-11-23T02:48:20.7011382Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7011530Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7011988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7012172Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7012548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.7012681Z _lazy_init(state, module) 2022-11-23T02:48:20.7013036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7013168Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7013515Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7013645Z return func(*args, **kwargs) 2022-11-23T02:48:20.7014033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7014146Z p_assert( 2022-11-23T02:48:20.7014491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7014629Z traceback.print_stack() 2022-11-23T02:48:20.7014871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7015142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7015282Z File "", line 1, in 2022-11-23T02:48:20.7015497Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7015643Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7015848Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7016004Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7016220Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7016336Z self.run() 2022-11-23T02:48:20.7016526Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7016668Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7017018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7017154Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7017525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7017653Z getattr(self, test_name)() 2022-11-23T02:48:20.7018020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7018105Z fn() 2022-11-23T02:48:20.7018479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7018603Z test(self, **param_kwargs) 2022-11-23T02:48:20.7018975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7019111Z return func(*args, **kwargs) 2022-11-23T02:48:20.7019352Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7019468Z self.run_subtests( 2022-11-23T02:48:20.7019832Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7019984Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7020360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7020515Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7020900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7021087Z output = model(*input) 2022-11-23T02:48:20.7021424Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7021568Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7021954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7022123Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7022502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7022623Z _lazy_init(state, module) 2022-11-23T02:48:20.7022986Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7023132Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7023476Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7023610Z return func(*args, **kwargs) 2022-11-23T02:48:20.7023999Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7024091Z p_assert( 2022-11-23T02:48:20.7024436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7024619Z traceback.print_stack() 2022-11-23T02:48:20.7024763Z File "", line 1, in 2022-11-23T02:48:20.7024980Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7025125Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7025327Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7025483Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7025683Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7025792Z self.run() 2022-11-23T02:48:20.7025997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7026143Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7026492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7026630Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7027010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7027139Z getattr(self, test_name)() 2022-11-23T02:48:20.7027486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7027585Z fn() 2022-11-23T02:48:20.7027952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7028079Z test(self, **param_kwargs) 2022-11-23T02:48:20.7028448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7028579Z return func(*args, **kwargs) 2022-11-23T02:48:20.7028820Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7028920Z self.run_subtests( 2022-11-23T02:48:20.7029285Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7029454Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7029822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7029979Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7030365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7030571Z output = model(*input) 2022-11-23T02:48:20.7030912Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7031041Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7031429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7031617Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7031993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7032117Z _lazy_init(state, module) 2022-11-23T02:48:20.7032475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7032622Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7032968Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7033102Z return func(*args, **kwargs) 2022-11-23T02:48:20.7033475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7033581Z p_assert( 2022-11-23T02:48:20.7033926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7034110Z traceback.print_stack() 2022-11-23T02:48:20.7034366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7034609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7034744Z File "", line 1, in 2022-11-23T02:48:20.7034941Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7035336Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7035548Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7035709Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7035927Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7036033Z self.run() 2022-11-23T02:48:20.7036240Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7036395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7036739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7036876Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7037250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7037377Z getattr(self, test_name)() 2022-11-23T02:48:20.7037743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7037850Z fn() 2022-11-23T02:48:20.7038218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7038342Z test(self, **param_kwargs) 2022-11-23T02:48:20.7038690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7038824Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.7039067Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7039184Z self.run_subtests( 2022-11-23T02:48:20.7039544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7039714Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7040087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7040337Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7040710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7040833Z output = model(*input) 2022-11-23T02:48:20.7041166Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7041315Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7041700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7041881Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7042260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7042382Z _lazy_init(state, module) 2022-11-23T02:48:20.7042728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7042879Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7043232Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7043363Z return func(*args, **kwargs) 2022-11-23T02:48:20.7043813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7043937Z p_assert( 2022-11-23T02:48:20.7044282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7044416Z traceback.print_stack() 2022-11-23T02:48:20.7044533Z File "", line 1, in 2022-11-23T02:48:20.7044746Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7044892Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7045097Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7045258Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7045474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7045577Z self.run() 2022-11-23T02:48:20.7045766Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7045920Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7046272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7046414Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7046782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7046909Z getattr(self, test_name)() 2022-11-23T02:48:20.7047269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7047375Z fn() 2022-11-23T02:48:20.7047732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7047862Z test(self, **param_kwargs) 2022-11-23T02:48:20.7048223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7048355Z return func(*args, **kwargs) 
2022-11-23T02:48:20.7048602Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7048718Z self.run_subtests( 2022-11-23T02:48:20.7049077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7049244Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7049599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7049820Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7050207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7050332Z output = model(*input) 2022-11-23T02:48:20.7050664Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7050815Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7051198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7051380Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7051739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7051868Z _lazy_init(state, module) 2022-11-23T02:48:20.7052230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7052380Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7052728Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7052858Z return func(*args, **kwargs) 2022-11-23T02:48:20.7053298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.7053413Z p_assert( 2022-11-23T02:48:20.7053746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7053876Z traceback.print_stack() 2022-11-23T02:48:20.7054118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7054358Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7054597Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7054844Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7055075Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7055305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7055525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7055754Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7055983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7056213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7056443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7056676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7056914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7057143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7057356Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7057591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7057818Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7058046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7058272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7058504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7058857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7059087Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7059316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7059534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7059758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7059986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7060216Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7060440Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7060665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7060894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7061122Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7061334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7061610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7061852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7062079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7062307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7062538Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7062763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7062995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7063223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7063435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7063664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7063890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7064117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7064343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7064572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7064799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7065031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7065132Z dist init r=0, world=2 2022-11-23T02:48:20.7065474Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7065806Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7066122Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7066430Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7066806Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7067114Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7067451Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7067770Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7068084Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7068393Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.7068689Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7069037Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7069165Z dist init r=1, world=2 2022-11-23T02:48:20.7069481Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7069793Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7070105Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7070416Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7070728Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7071035Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7071341Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7071647Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.7071961Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7072252Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7072564Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7072871Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7072974Z ok (9.419s) 2022-11-23T02:48:20.7073313Z test_transformer_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91639 2022-11-23T02:48:20.7073647Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91640 2022-11-23T02:48:20.7074047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.7074228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.7074625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.7074807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.7075496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.7075687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.7076094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:48:20.7076298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.7076553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.7076800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.7077290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.7077715Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.7077932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.7078164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.7078404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7078648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7079685Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.7079805Z warnings.warn( 2022-11-23T02:48:20.7080828Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.7080949Z warnings.warn( 2022-11-23T02:48:20.7081081Z File "", line 1, in 2022-11-23T02:48:20.7081298Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7081447Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7081644Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7081794Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7082011Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7082117Z self.run() 2022-11-23T02:48:20.7082323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7082473Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7082825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7083025Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7083406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7083534Z getattr(self, test_name)() 2022-11-23T02:48:20.7083911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7084015Z fn() 2022-11-23T02:48:20.7084387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7084516Z test(self, **param_kwargs) 2022-11-23T02:48:20.7084876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7084990Z return func(*args, **kwargs) 
2022-11-23T02:48:20.7085234Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7085360Z self.run_subtests( 2022-11-23T02:48:20.7085725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7085896Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7086321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7086490Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7086881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7086987Z output = model(*input) 2022-11-23T02:48:20.7087314Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7087454Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7087838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7088025Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7088407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7088537Z _lazy_init(state, module) 2022-11-23T02:48:20.7088898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7089030Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7089378Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7089508Z return func(*args, **kwargs) 2022-11-23T02:48:20.7089897Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.7090005Z p_assert( 2022-11-23T02:48:20.7090357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7090490Z traceback.print_stack() 2022-11-23T02:48:20.7090621Z File "", line 1, in 2022-11-23T02:48:20.7090818Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7090961Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7091170Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7091324Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7091541Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7091650Z self.run() 2022-11-23T02:48:20.7091856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7091992Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7092341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7092563Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7092939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7093067Z getattr(self, test_name)() 2022-11-23T02:48:20.7093436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7093538Z fn() 2022-11-23T02:48:20.7093909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7094021Z test(self, **param_kwargs) 2022-11-23T02:48:20.7094386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7094517Z return func(*args, **kwargs) 2022-11-23T02:48:20.7094761Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7094882Z self.run_subtests( 2022-11-23T02:48:20.7095245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7095412Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7095840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7095992Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7096383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7096504Z output = model(*input) 2022-11-23T02:48:20.7096837Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7096983Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7097371Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7097560Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7097939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7098048Z _lazy_init(state, module) 2022-11-23T02:48:20.7098413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7098559Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7098904Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7099033Z return func(*args, **kwargs) 2022-11-23T02:48:20.7099414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.7099525Z p_assert( 2022-11-23T02:48:20.7099871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7099987Z traceback.print_stack() 2022-11-23T02:48:20.7100230Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7100469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7100607Z File "", line 1, in 2022-11-23T02:48:20.7100823Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7100971Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7101175Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7101327Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7101528Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7101635Z self.run() 2022-11-23T02:48:20.7101905Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7102056Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7102409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7102545Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7102920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7103053Z getattr(self, test_name)() 2022-11-23T02:48:20.7103408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7103507Z fn() 2022-11-23T02:48:20.7103878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7104004Z test(self, **param_kwargs) 2022-11-23T02:48:20.7104372Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7104506Z return func(*args, **kwargs) 2022-11-23T02:48:20.7104751Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7104850Z self.run_subtests( 2022-11-23T02:48:20.7105261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7105440Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7105819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7105976Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7106359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7106482Z output = model(*input) 2022-11-23T02:48:20.7106886Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7107015Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7107396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7107573Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7107956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7108081Z _lazy_init(state, module) 2022-11-23T02:48:20.7108443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7108591Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7108939Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.7109069Z return func(*args, **kwargs) 2022-11-23T02:48:20.7109449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7109556Z p_assert( 2022-11-23T02:48:20.7109903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7110038Z traceback.print_stack() 2022-11-23T02:48:20.7110175Z File "", line 1, in 2022-11-23T02:48:20.7110387Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7110533Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7110724Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7110878Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7111091Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7111195Z self.run() 2022-11-23T02:48:20.7111472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7111619Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7111972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7112107Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7112460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7112590Z getattr(self, test_name)() 2022-11-23T02:48:20.7112961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7113064Z fn() 2022-11-23T02:48:20.7113430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7113555Z test(self, **param_kwargs) 2022-11-23T02:48:20.7113919Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7114056Z return func(*args, **kwargs) 2022-11-23T02:48:20.7114283Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7114398Z self.run_subtests( 2022-11-23T02:48:20.7114813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7114991Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7115684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7115841Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7116228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7116350Z output = model(*input) 2022-11-23T02:48:20.7116674Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7116819Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7117207Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7117393Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7117771Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7117896Z _lazy_init(state, module) 2022-11-23T02:48:20.7118257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7118403Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7118733Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.7118865Z return func(*args, **kwargs) 2022-11-23T02:48:20.7119257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7119363Z p_assert( 2022-11-23T02:48:20.7119709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7119840Z traceback.print_stack() 2022-11-23T02:48:20.7120088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7120332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7120447Z File "", line 1, in 2022-11-23T02:48:20.7120661Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7120803Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7121009Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7121265Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7121483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7121592Z self.run() 2022-11-23T02:48:20.7121781Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7121930Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7122288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7122425Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7122795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7122922Z getattr(self, test_name)() 2022-11-23T02:48:20.7123289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7123394Z fn() 
2022-11-23T02:48:20.7123757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7123886Z test(self, **param_kwargs) 2022-11-23T02:48:20.7124250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7124378Z return func(*args, **kwargs) 2022-11-23T02:48:20.7124692Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7124821Z self.run_subtests( 2022-11-23T02:48:20.7125189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7125358Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7125709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7125869Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7126257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7126381Z output = model(*input) 2022-11-23T02:48:20.7126720Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7126867Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7127257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7127442Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7127805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7127930Z _lazy_init(state, module) 2022-11-23T02:48:20.7128294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.7128451Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7128583Z File "", line 1, in 2022-11-23T02:48:20.7128931Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7129061Z return func(*args, **kwargs) 2022-11-23T02:48:20.7129453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7129545Z p_assert( 2022-11-23T02:48:20.7129764Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7129909Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7130258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7130382Z traceback.print_stack() 2022-11-23T02:48:20.7130589Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7130816Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7131018Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7131127Z self.run() 2022-11-23T02:48:20.7131333Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7131485Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7131833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7131972Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7132349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7132477Z getattr(self, test_name)() 2022-11-23T02:48:20.7132831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7132933Z fn() 2022-11-23T02:48:20.7133310Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7133437Z test(self, **param_kwargs) 2022-11-23T02:48:20.7133805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7133936Z return func(*args, **kwargs) 2022-11-23T02:48:20.7134233Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7134359Z self.run_subtests( 2022-11-23T02:48:20.7134711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7134881Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7135255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7135413Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7135800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7135923Z output = model(*input) 2022-11-23T02:48:20.7136258Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7136407Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7136779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7136961Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7137335Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7137463Z _lazy_init(state, module) 2022-11-23T02:48:20.7137820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.7137969Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7138319Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7138448Z return func(*args, **kwargs) 2022-11-23T02:48:20.7138828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7138934Z p_assert( 2022-11-23T02:48:20.7139278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7139410Z traceback.print_stack() 2022-11-23T02:48:20.7139654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7139900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7140036Z File "", line 1, in 2022-11-23T02:48:20.7140256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7140451Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7140661Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7140813Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7141028Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7141138Z self.run() 2022-11-23T02:48:20.7141342Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7141491Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7141828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7141963Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7142332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7142463Z getattr(self, test_name)() 
2022-11-23T02:48:20.7142832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7142932Z fn() 2022-11-23T02:48:20.7143302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7143483Z test(self, **param_kwargs) 2022-11-23T02:48:20.7143846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7143978Z return func(*args, **kwargs) 2022-11-23T02:48:20.7144222Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7144341Z self.run_subtests( 2022-11-23T02:48:20.7144706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7144873Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7145250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7145408Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7145776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7145904Z output = model(*input) 2022-11-23T02:48:20.7146238Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7146384Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7146768Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7146945Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7147077Z File "", line 1, in 2022-11-23T02:48:20.7147459Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7147568Z _lazy_init(state, module) 2022-11-23T02:48:20.7147928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7148075Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7148295Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7148439Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7148787Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7148915Z return func(*args, **kwargs) 2022-11-23T02:48:20.7149119Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7149257Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7149646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7149819Z p_assert( 2022-11-23T02:48:20.7150035Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7150147Z self.run() 2022-11-23T02:48:20.7150496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7150631Z traceback.print_stack() 2022-11-23T02:48:20.7150846Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7150978Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7151325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7151458Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7151828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:48:20.7151956Z getattr(self, test_name)() 2022-11-23T02:48:20.7152321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7152425Z fn() 2022-11-23T02:48:20.7152799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7152962Z test(self, **param_kwargs) 2022-11-23T02:48:20.7153340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7153468Z return func(*args, **kwargs) 2022-11-23T02:48:20.7153710Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7153828Z self.run_subtests( 2022-11-23T02:48:20.7154190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7154357Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7154733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7154874Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7155454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7155590Z output = model(*input) 2022-11-23T02:48:20.7155931Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7156075Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7156459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7156639Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7157015Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7157128Z _lazy_init(state, module) 2022-11-23T02:48:20.7157488Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7157636Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7157984Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7158115Z return func(*args, **kwargs) 2022-11-23T02:48:20.7158498Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7158604Z p_assert( 2022-11-23T02:48:20.7158950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7159064Z traceback.print_stack() 2022-11-23T02:48:20.7159303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7159652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7159788Z File "", line 1, in 2022-11-23T02:48:20.7159918Z File "", line 1, in 2022-11-23T02:48:20.7160132Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7160282Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7160471Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7160621Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7160828Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7160967Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7161186Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7161293Z self.run() 2022-11-23T02:48:20.7161495Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7161653Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7161842Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7161988Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7162268Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7162390Z self.run() 2022-11-23T02:48:20.7162750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7162889Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7163096Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7163228Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7163603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7163735Z getattr(self, test_name)() 2022-11-23T02:48:20.7164080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in 
_run 2022-11-23T02:48:20.7164217Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7164584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7164687Z fn() 2022-11-23T02:48:20.7165057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7165168Z getattr(self, test_name)() 2022-11-23T02:48:20.7165541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7165666Z test(self, **param_kwargs) 2022-11-23T02:48:20.7166030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7166133Z fn() 2022-11-23T02:48:20.7166497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7166629Z return func(*args, **kwargs) 2022-11-23T02:48:20.7167001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7167110Z test(self, **param_kwargs) 2022-11-23T02:48:20.7167359Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7167477Z self.run_subtests( 2022-11-23T02:48:20.7167842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7167969Z return func(*args, **kwargs) 2022-11-23T02:48:20.7168329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7168498Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7168809Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 
2022-11-23T02:48:20.7168909Z self.run_subtests( 2022-11-23T02:48:20.7169289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7169448Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7169802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7169970Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7170354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7170475Z output = model(*input) 2022-11-23T02:48:20.7170844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7170989Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7171325Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7171470Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7171911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7172041Z output = model(*input) 2022-11-23T02:48:20.7172430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7172612Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7172946Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7173075Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7173453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7173625Z _lazy_init(state, module) 2022-11-23T02:48:20.7174022Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7174203Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7174570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7174722Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7175093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7175201Z _lazy_init(state, module) 2022-11-23T02:48:20.7175544Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7175669Z return func(*args, **kwargs) 2022-11-23T02:48:20.7176031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7176176Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7176563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7176672Z p_assert( 2022-11-23T02:48:20.7177021Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7177132Z return func(*args, **kwargs) 2022-11-23T02:48:20.7177479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7177609Z traceback.print_stack() 2022-11-23T02:48:20.7177995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7178097Z p_assert( 2022-11-23T02:48:20.7178436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7178643Z traceback.print_stack() 2022-11-23T02:48:20.7178884Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7179108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7179244Z File "", line 1, in 2022-11-23T02:48:20.7179459Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7179603Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7179804Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7179958Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7180173Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7180277Z self.run() 2022-11-23T02:48:20.7180468Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7180622Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7180974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7181111Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7181535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7181673Z getattr(self, test_name)() 2022-11-23T02:48:20.7182044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7182129Z fn() 2022-11-23T02:48:20.7182501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7182629Z test(self, **param_kwargs) 2022-11-23T02:48:20.7182989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7183123Z return func(*args, **kwargs) 2022-11-23T02:48:20.7183372Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7183494Z self.run_subtests( 2022-11-23T02:48:20.7183857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7184005Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7184376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7184532Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7184915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7185038Z output = model(*input) 2022-11-23T02:48:20.7185368Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7185520Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7185907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7186072Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7186449Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7186572Z _lazy_init(state, module) 2022-11-23T02:48:20.7186934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7187083Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7187430Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7187557Z return func(*args, **kwargs) 2022-11-23T02:48:20.7188014Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.7188104Z p_assert( 2022-11-23T02:48:20.7188448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7188579Z traceback.print_stack() 2022-11-23T02:48:20.7188715Z File "", line 1, in 2022-11-23T02:48:20.7188930Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7189075Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7189280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7189434Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7189635Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7189741Z self.run() 2022-11-23T02:48:20.7189952Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7190098Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7190448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7190585Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7191007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7191144Z getattr(self, test_name)() 2022-11-23T02:48:20.7191496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7191596Z fn() 2022-11-23T02:48:20.7191965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7192089Z test(self, **param_kwargs) 2022-11-23T02:48:20.7192453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7192586Z return func(*args, **kwargs) 2022-11-23T02:48:20.7192828Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7192931Z self.run_subtests( 2022-11-23T02:48:20.7193298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7193464Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7193833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7193987Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7194373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7194498Z output = model(*input) 2022-11-23T02:48:20.7194835Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7194964Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7195540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7195724Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7196103Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7196230Z _lazy_init(state, module) 2022-11-23T02:48:20.7196592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7196736Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7197081Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7197210Z return func(*args, **kwargs) 2022-11-23T02:48:20.7197675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:48:20.7197781Z p_assert( 2022-11-23T02:48:20.7198127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7198259Z traceback.print_stack() 2022-11-23T02:48:20.7198507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7198744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7198870Z File "", line 1, in 2022-11-23T02:48:20.7199068Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7199211Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7199412Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7199571Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7199786Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7199892Z self.run() 2022-11-23T02:48:20.7200097Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7200248Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7200696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7200848Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7201227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7201356Z getattr(self, test_name)() 2022-11-23T02:48:20.7201728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7201829Z fn() 2022-11-23T02:48:20.7202204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7202335Z test(self, **param_kwargs) 2022-11-23T02:48:20.7202690Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7202818Z return func(*args, **kwargs) 2022-11-23T02:48:20.7203066Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7203180Z self.run_subtests( 2022-11-23T02:48:20.7203540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7203707Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7204077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7204236Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7204609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7204733Z output = model(*input) 2022-11-23T02:48:20.7205067Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7205211Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7205604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7205787Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7206164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7206294Z _lazy_init(state, module) 2022-11-23T02:48:20.7206642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7206854Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7207205Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.7207336Z return func(*args, **kwargs) 2022-11-23T02:48:20.7207724Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7207836Z p_assert( 2022-11-23T02:48:20.7208181Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7208310Z traceback.print_stack() 2022-11-23T02:48:20.7208424Z File "", line 1, in 2022-11-23T02:48:20.7208636Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7208782Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7208987Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7209146Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7209359Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7209460Z self.run() 2022-11-23T02:48:20.7209648Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7209796Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7210199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7210347Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7210723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7210849Z getattr(self, test_name)() 2022-11-23T02:48:20.7211217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7211324Z fn() 2022-11-23T02:48:20.7211687Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7211818Z test(self, **param_kwargs) 2022-11-23T02:48:20.7212180Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7212311Z return func(*args, **kwargs) 2022-11-23T02:48:20.7212558Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7212677Z self.run_subtests( 2022-11-23T02:48:20.7213041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7213208Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7213566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7213724Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7214110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7214234Z output = model(*input) 2022-11-23T02:48:20.7214569Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7214712Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7215101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7215282Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7215648Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7215773Z _lazy_init(state, module) 2022-11-23T02:48:20.7216132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7216344Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7216696Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.7216827Z return func(*args, **kwargs) 2022-11-23T02:48:20.7217213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7217320Z p_assert( 2022-11-23T02:48:20.7217650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7217780Z traceback.print_stack() 2022-11-23T02:48:20.7218023Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7218261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7218393Z File "", line 1, in 2022-11-23T02:48:20.7218607Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7218757Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7218964Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7219102Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7219314Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7219465Z self.run() 2022-11-23T02:48:20.7219685Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7219836Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7220196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7220335Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7220688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7220825Z getattr(self, test_name)() 2022-11-23T02:48:20.7221193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7221296Z fn() 
2022-11-23T02:48:20.7221668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7221794Z test(self, **param_kwargs) 2022-11-23T02:48:20.7222161Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7222292Z return func(*args, **kwargs) 2022-11-23T02:48:20.7222521Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7222636Z self.run_subtests( 2022-11-23T02:48:20.7222999Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7223165Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7223541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7223698Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7224080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7224207Z output = model(*input) 2022-11-23T02:48:20.7224526Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7224674Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7225052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7225232Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7225610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7225816Z _lazy_init(state, module) 2022-11-23T02:48:20.7226178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.7226322Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7226657Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7226784Z return func(*args, **kwargs) 2022-11-23T02:48:20.7227170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7227278Z p_assert( 2022-11-23T02:48:20.7227625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7227755Z traceback.print_stack() 2022-11-23T02:48:20.7227886Z File "", line 1, in 2022-11-23T02:48:20.7228097Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7228228Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7228432Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7228583Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7228798Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7228967Z self.run() 2022-11-23T02:48:20.7229186Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7229338Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7229693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7229812Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7230183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7230312Z getattr(self, test_name)() 2022-11-23T02:48:20.7230685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7230787Z fn() 2022-11-23T02:48:20.7231156Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7231283Z test(self, **param_kwargs) 2022-11-23T02:48:20.7231638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7231767Z return func(*args, **kwargs) 2022-11-23T02:48:20.7232008Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7232122Z self.run_subtests( 2022-11-23T02:48:20.7232482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7232649Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7233020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7233176Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7233543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7233670Z output = model(*input) 2022-11-23T02:48:20.7234003Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7234145Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7234527Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7234705Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7235252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7235480Z _lazy_init(state, module) 2022-11-23T02:48:20.7235860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.7235991Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7236343Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7236472Z return func(*args, **kwargs) 2022-11-23T02:48:20.7236860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7236966Z p_assert( 2022-11-23T02:48:20.7237309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7237436Z traceback.print_stack() 2022-11-23T02:48:20.7237661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7237904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7238034Z File "", line 1, in 2022-11-23T02:48:20.7238250Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7238393Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7238664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7238833Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7239055Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7239147Z self.run() 2022-11-23T02:48:20.7239279Z File "", line 1, in 2022-11-23T02:48:20.7239486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7239637Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7239992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7240132Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7240343Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7240486Z exitcode = 
_main(fd, parent_sentinel) 2022-11-23T02:48:20.7240846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7240973Z getattr(self, test_name)() 2022-11-23T02:48:20.7241177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7241331Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7241702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7241804Z fn() 2022-11-23T02:48:20.7242021Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7242118Z self.run() 2022-11-23T02:48:20.7242495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7242618Z test(self, **param_kwargs) 2022-11-23T02:48:20.7242823Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7242970Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7243340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7243467Z return func(*args, **kwargs) 2022-11-23T02:48:20.7243812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7243932Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7244176Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7244288Z self.run_subtests( 2022-11-23T02:48:20.7244726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7244855Z getattr(self, test_name)() 2022-11-23T02:48:20.7245218Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7245388Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7245755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7245839Z fn() 2022-11-23T02:48:20.7246215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7246372Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7246744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7246876Z test(self, **param_kwargs) 2022-11-23T02:48:20.7247260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7247384Z output = model(*input) 2022-11-23T02:48:20.7247748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7247908Z return func(*args, **kwargs) 2022-11-23T02:48:20.7248253Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7248399Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7248783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7248960Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7249205Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7249326Z self.run_subtests( 2022-11-23T02:48:20.7249704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7249813Z 
_lazy_init(state, module) 2022-11-23T02:48:20.7250173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7250325Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7250684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7250852Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7251198Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7251324Z return func(*args, **kwargs) 2022-11-23T02:48:20.7251712Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7251811Z p_assert( 2022-11-23T02:48:20.7252184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7252344Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7252689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7252824Z traceback.print_stack() 2022-11-23T02:48:20.7253203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7253325Z output = model(*input) 2022-11-23T02:48:20.7253653Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7253779Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7254167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7254416Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7254793Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7254919Z _lazy_init(state, module) 2022-11-23T02:48:20.7255281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7255431Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7255779Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7255892Z return func(*args, **kwargs) 2022-11-23T02:48:20.7256280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7256383Z p_assert( 2022-11-23T02:48:20.7256726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7256862Z traceback.print_stack() 2022-11-23T02:48:20.7257104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7257346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7257526Z File "", line 1, in 2022-11-23T02:48:20.7257736Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7257883Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7258088Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7258246Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7258462Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7258569Z self.run() 2022-11-23T02:48:20.7258772Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7258911Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7259268Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7259405Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7259781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7259904Z getattr(self, test_name)() 2022-11-23T02:48:20.7260268Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7260367Z fn() 2022-11-23T02:48:20.7260738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7260848Z test(self, **param_kwargs) 2022-11-23T02:48:20.7261211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7261344Z return func(*args, **kwargs) 2022-11-23T02:48:20.7261587Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7261703Z self.run_subtests( 2022-11-23T02:48:20.7262067Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7262228Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7262597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7262737Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7263122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7263246Z output = model(*input) 2022-11-23T02:48:20.7263579Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7263786Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7264172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7264354Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7264734Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7264842Z _lazy_init(state, module) 2022-11-23T02:48:20.7265198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7265346Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7265689Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7265817Z return func(*args, **kwargs) 2022-11-23T02:48:20.7266207Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7266312Z p_assert( 2022-11-23T02:48:20.7266653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7266767Z traceback.print_stack() 2022-11-23T02:48:20.7266945Z File "", line 1, in 2022-11-23T02:48:20.7267170Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7267315Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7267522Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7267676Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7267892Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7267998Z self.run() 2022-11-23T02:48:20.7268187Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7268340Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7268692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7268826Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7269200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7269332Z getattr(self, test_name)() 2022-11-23T02:48:20.7269697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7269782Z fn() 2022-11-23T02:48:20.7270154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7270283Z test(self, **param_kwargs) 2022-11-23T02:48:20.7270646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7270781Z return func(*args, **kwargs) 2022-11-23T02:48:20.7271027Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7271146Z self.run_subtests( 2022-11-23T02:48:20.7271509Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7271661Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7272031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7272186Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7272565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7272690Z output = model(*input) 2022-11-23T02:48:20.7273019Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7273229Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7273663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7273834Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7274219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7274346Z _lazy_init(state, module) 2022-11-23T02:48:20.7274702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7274850Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7275373Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7275510Z return func(*args, **kwargs) 2022-11-23T02:48:20.7275911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7276002Z p_assert( 2022-11-23T02:48:20.7276346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7276478Z traceback.print_stack() 2022-11-23T02:48:20.7276801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7277052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7277186Z File "", line 1, in 2022-11-23T02:48:20.7277397Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7277540Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7277729Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7277878Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7278101Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7278207Z self.run() 2022-11-23T02:48:20.7278412Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7278558Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7278916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7279054Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7279408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7279531Z getattr(self, test_name)() 2022-11-23T02:48:20.7279899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7280001Z fn() 2022-11-23T02:48:20.7280370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7280501Z test(self, **param_kwargs) 2022-11-23T02:48:20.7280863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7280991Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.7281221Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7281335Z self.run_subtests( 2022-11-23T02:48:20.7281696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7281863Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7282234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7282390Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7282863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7282990Z output = model(*input) 2022-11-23T02:48:20.7283307Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7283451Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7283839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7284025Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7284400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7284525Z _lazy_init(state, module) 2022-11-23T02:48:20.7284880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7285028Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7285361Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7285491Z return func(*args, **kwargs) 2022-11-23T02:48:20.7285878Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7286034Z p_assert( 2022-11-23T02:48:20.7286393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7286525Z traceback.print_stack() 2022-11-23T02:48:20.7286654Z File "", line 1, in 2022-11-23T02:48:20.7286851Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7286996Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7287202Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7287363Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7287580Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7287688Z self.run() 2022-11-23T02:48:20.7287898Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7288048Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7288384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7288519Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7288892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7289019Z getattr(self, test_name)() 2022-11-23T02:48:20.7289382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7289483Z fn() 2022-11-23T02:48:20.7289852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7289985Z test(self, **param_kwargs) 2022-11-23T02:48:20.7290335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7290462Z return func(*args, **kwargs) 
2022-11-23T02:48:20.7290705Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7290821Z self.run_subtests( 2022-11-23T02:48:20.7291184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7291352Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7291728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7291885Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7292336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7292462Z output = model(*input) 2022-11-23T02:48:20.7292797Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7292944Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7293334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7293512Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7293886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7294010Z _lazy_init(state, module) 2022-11-23T02:48:20.7294352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7294498Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7294843Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7294970Z return func(*args, **kwargs) 2022-11-23T02:48:20.7295362Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.7295518Z p_assert( 2022-11-23T02:48:20.7295873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7296001Z traceback.print_stack() 2022-11-23T02:48:20.7296225Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7296465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7296597Z File "", line 1, in 2022-11-23T02:48:20.7296809Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7296958Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7297164Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7297315Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7297517Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7297625Z self.run() 2022-11-23T02:48:20.7297830Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7297977Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7298325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7298463Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7298835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7298961Z getattr(self, test_name)() 2022-11-23T02:48:20.7299314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7299414Z fn() 2022-11-23T02:48:20.7299539Z File "", line 1, in 2022-11-23T02:48:20.7299912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:48:20.7300041Z test(self, **param_kwargs) 2022-11-23T02:48:20.7300405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7300535Z return func(*args, **kwargs) 2022-11-23T02:48:20.7300748Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7300875Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7301113Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7301296Z self.run_subtests( 2022-11-23T02:48:20.7301502Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7301653Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7302017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7302190Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7302411Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7302501Z self.run() 2022-11-23T02:48:20.7302878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7303035Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7303244Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7303393Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7303778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7303902Z output = model(*input) 2022-11-23T02:48:20.7304231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7304368Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7304753Z 
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7304906Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7305282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7305407Z getattr(self, test_name)() 2022-11-23T02:48:20.7305791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7305971Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7306344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7306429Z fn() 2022-11-23T02:48:20.7306804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7306929Z _lazy_init(state, module) 2022-11-23T02:48:20.7307301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7307427Z test(self, **param_kwargs) 2022-11-23T02:48:20.7307786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7307932Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7308283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7308415Z return func(*args, **kwargs) 2022-11-23T02:48:20.7308759Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7308886Z return func(*args, **kwargs) 2022-11-23T02:48:20.7309123Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7309243Z self.run_subtests( 
2022-11-23T02:48:20.7309635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7309742Z p_assert( 2022-11-23T02:48:20.7310084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7310250Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7310592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7310720Z traceback.print_stack() 2022-11-23T02:48:20.7311156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7311314Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7311699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7311826Z output = model(*input) 2022-11-23T02:48:20.7312142Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7312287Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7312673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7312855Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7313230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7313359Z _lazy_init(state, module) 2022-11-23T02:48:20.7313716Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7313862Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7314243Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:48:20.7314383Z return func(*args, **kwargs) 2022-11-23T02:48:20.7314772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7314877Z p_assert( 2022-11-23T02:48:20.7315401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7315542Z traceback.print_stack() 2022-11-23T02:48:20.7315785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7316032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7316256Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7316495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7316730Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7316959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7317736Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7318496Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7319243Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7320000Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7320755Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7321601Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7321847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7322087Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7322317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7322551Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7322774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7323001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7323225Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7323516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7323761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7323989Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7324221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7324447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7325202Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7325960Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7326181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7326414Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7326644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7326879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7327116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7327344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7327575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7327805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7328031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7328242Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7328469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7328694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7329446Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7330268Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7330508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7330744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7330974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7331202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7331430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7331644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7331872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7332151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7332391Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7332618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7332841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7333066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7333819Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7334572Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7334807Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7335024Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7335257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7335494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7335723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7335949Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7336065Z dist init r=1, world=2 2022-11-23T02:48:20.7336406Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7336724Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7337037Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7337330Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7337703Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7338015Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7338322Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7338623Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.7338928Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7339238Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7339613Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7339944Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7340060Z dist init r=0, world=2 2022-11-23T02:48:20.7340371Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7340679Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7340974Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7341286Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7341589Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7341891Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.7342192Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7342503Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7342809Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7343116Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7343421Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7343726Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7343829Z ok (9.619s) 2022-11-23T02:48:20.7344282Z test_transformer_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91722 2022-11-23T02:48:20.7344512Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91723 2022-11-23T02:48:20.7344913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.7345092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.7345485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.7345680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.7346056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:20.7346232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:20.7346620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:20.7346802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:20.7347050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:20.7347346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:20.7347768Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:20.7348179Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:48:20.7348416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:20.7348647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:20.7348889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7349128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7350146Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:48:20.7350262Z warnings.warn( 2022-11-23T02:48:20.7351289Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:48:20.7351409Z warnings.warn( 2022-11-23T02:48:20.7351542Z File "", line 1, in 2022-11-23T02:48:20.7351765Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7351916Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7352124Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7352279Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7352497Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7352586Z self.run() 2022-11-23T02:48:20.7352790Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7353018Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7353372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7353510Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7353882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7354012Z getattr(self, test_name)() 2022-11-23T02:48:20.7354387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7354472Z fn() 2022-11-23T02:48:20.7354846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7354976Z test(self, **param_kwargs) 2022-11-23T02:48:20.7355585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7355723Z return func(*args, **kwargs) 2022-11-23T02:48:20.7355966Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7356081Z self.run_subtests( 2022-11-23T02:48:20.7356427Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7356677Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7357071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7357222Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7357606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7357731Z output = model(*input) 2022-11-23T02:48:20.7358066Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7358221Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7358597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7358779Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7359157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7359281Z _lazy_init(state, module) 2022-11-23T02:48:20.7359641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7359786Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7360131Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7360260Z return func(*args, **kwargs) 2022-11-23T02:48:20.7360652Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7360746Z p_assert( 2022-11-23T02:48:20.7361092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7361225Z traceback.print_stack() 2022-11-23T02:48:20.7361355Z File "", line 1, in 2022-11-23T02:48:20.7361573Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7361716Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7361920Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7362057Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7362277Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7362380Z self.run() 2022-11-23T02:48:20.7362584Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7362819Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7363176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7363313Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7363684Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7363799Z getattr(self, test_name)() 2022-11-23T02:48:20.7364167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7364268Z fn() 2022-11-23T02:48:20.7364640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7364762Z test(self, **param_kwargs) 2022-11-23T02:48:20.7365129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7365266Z return func(*args, **kwargs) 2022-11-23T02:48:20.7365514Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7365612Z self.run_subtests( 2022-11-23T02:48:20.7365975Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7366197Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7366579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7366738Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7367120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7367243Z output = model(*input) 2022-11-23T02:48:20.7367576Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7367708Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7368092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7368274Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7368651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7368776Z _lazy_init(state, module) 2022-11-23T02:48:20.7369139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7369285Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7369628Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7369740Z return func(*args, **kwargs) 2022-11-23T02:48:20.7370125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7370236Z p_assert( 2022-11-23T02:48:20.7370582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7370712Z traceback.print_stack() 2022-11-23T02:48:20.7370959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7371201Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7371332Z File "", line 1, in 2022-11-23T02:48:20.7371530Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7371672Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7371876Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7372028Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7372312Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7372417Z self.run() 2022-11-23T02:48:20.7372622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7372754Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7373107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7373245Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7373615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7373777Z getattr(self, test_name)() 2022-11-23T02:48:20.7374148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7374250Z fn() 2022-11-23T02:48:20.7374622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7374736Z test(self, **param_kwargs) 2022-11-23T02:48:20.7375099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7375228Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.7375528Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7375659Z self.run_subtests( 2022-11-23T02:48:20.7376021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7376189Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7376563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7376702Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7377082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7377210Z output = model(*input) 2022-11-23T02:48:20.7377545Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7377685Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7378069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7378251Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7378624Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7378731Z _lazy_init(state, module) 2022-11-23T02:48:20.7379086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7379232Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7379577Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7379706Z return func(*args, **kwargs) 2022-11-23T02:48:20.7380093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7380201Z p_assert( 2022-11-23T02:48:20.7380548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7380665Z traceback.print_stack() 2022-11-23T02:48:20.7380795Z File "", line 1, in 2022-11-23T02:48:20.7381011Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7381152Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7381359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7381510Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7381806Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7381895Z self.run() 2022-11-23T02:48:20.7382100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7382249Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7382607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7382744Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7383119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7383245Z getattr(self, test_name)() 2022-11-23T02:48:20.7383613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7383699Z fn() 2022-11-23T02:48:20.7384072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7384205Z test(self, **param_kwargs) 2022-11-23T02:48:20.7384570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7384695Z return func(*args, **kwargs) 
2022-11-23T02:48:20.7384990Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7385118Z self.run_subtests( 2022-11-23T02:48:20.7385481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7385630Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7386002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7386160Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7386544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7386669Z output = model(*input) 2022-11-23T02:48:20.7387001Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7387147Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7387532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7387697Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7388066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7388188Z _lazy_init(state, module) 2022-11-23T02:48:20.7388549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7388693Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7389045Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7389174Z return func(*args, **kwargs) 2022-11-23T02:48:20.7389563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.7389654Z p_assert( 2022-11-23T02:48:20.7390004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7390128Z traceback.print_stack() 2022-11-23T02:48:20.7390365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7390602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7390731Z File "", line 1, in 2022-11-23T02:48:20.7390945Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7391156Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7391350Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7391497Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7391711Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7391816Z self.run() 2022-11-23T02:48:20.7392027Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7392179Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7392526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7392665Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7393024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7393152Z getattr(self, test_name)() 2022-11-23T02:48:20.7393522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7393628Z fn() 2022-11-23T02:48:20.7394003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7394129Z test(self, **param_kwargs) 
2022-11-23T02:48:20.7394546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7394669Z return func(*args, **kwargs) 2022-11-23T02:48:20.7394914Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7395196Z self.run_subtests( 2022-11-23T02:48:20.7395594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7395764Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7396139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7396302Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7396690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7396797Z output = model(*input) 2022-11-23T02:48:20.7397131Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7397277Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7397668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7397854Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7398228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7398358Z _lazy_init(state, module) 2022-11-23T02:48:20.7398717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7398865Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7399195Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:48:20.7399325Z return func(*args, **kwargs) 2022-11-23T02:48:20.7399709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7399818Z p_assert( 2022-11-23T02:48:20.7400164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7400294Z traceback.print_stack() 2022-11-23T02:48:20.7400424Z File "", line 1, in 2022-11-23T02:48:20.7400621Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7400858Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7401066Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7401217Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7401432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7401536Z self.run() 2022-11-23T02:48:20.7401742Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7401887Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7402222Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7402355Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7402724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7402848Z getattr(self, test_name)() 2022-11-23T02:48:20.7403212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7403317Z fn() 2022-11-23T02:48:20.7403693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7403820Z test(self, **param_kwargs) 2022-11-23T02:48:20.7404235Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7404375Z return func(*args, **kwargs) 2022-11-23T02:48:20.7404615Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7404733Z self.run_subtests( 2022-11-23T02:48:20.7405095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7405259Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7405627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7405790Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7406156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7406281Z output = model(*input) 2022-11-23T02:48:20.7406669Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7406818Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7407210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7407386Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7407758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7407891Z _lazy_init(state, module) 2022-11-23T02:48:20.7408235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7408384Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7408728Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.7408859Z return func(*args, **kwargs) 2022-11-23T02:48:20.7409252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7409359Z p_assert( 2022-11-23T02:48:20.7409702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7409830Z traceback.print_stack() 2022-11-23T02:48:20.7410056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7410301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7410502Z File "", line 1, in 2022-11-23T02:48:20.7410723Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7410867Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7411073Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7411231Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7411432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7411538Z self.run() 2022-11-23T02:48:20.7411740Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7411892Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7412244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7412381Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7412756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7412883Z getattr(self, test_name)() 2022-11-23T02:48:20.7413236Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7413336Z fn() 
2022-11-23T02:48:20.7413756Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7413892Z test(self, **param_kwargs) 2022-11-23T02:48:20.7414263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7414394Z return func(*args, **kwargs) 2022-11-23T02:48:20.7414636Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7414751Z self.run_subtests( 2022-11-23T02:48:20.7415103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7415269Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7415644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7415802Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7416191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7416318Z output = model(*input) 2022-11-23T02:48:20.7416651Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7416798Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7417165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7417345Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7417720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7417844Z _lazy_init(state, module) 2022-11-23T02:48:20.7418204Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.7418356Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7418704Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7418834Z return func(*args, **kwargs) 2022-11-23T02:48:20.7419201Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7419304Z p_assert( 2022-11-23T02:48:20.7419649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7419852Z traceback.print_stack() 2022-11-23T02:48:20.7419984Z File "", line 1, in 2022-11-23T02:48:20.7420200Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7420349Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7420552Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7420696Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7420912Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7421019Z self.run() 2022-11-23T02:48:20.7421223Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7421374Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7421723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7421859Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7422215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7422341Z getattr(self, test_name)() 2022-11-23T02:48:20.7422708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7422810Z fn() 2022-11-23T02:48:20.7423231Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7423369Z test(self, **param_kwargs) 2022-11-23T02:48:20.7423733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7423859Z return func(*args, **kwargs) 2022-11-23T02:48:20.7424087Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7424199Z self.run_subtests( 2022-11-23T02:48:20.7424564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7424731Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7425096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7425260Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7425649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7425774Z output = model(*input) 2022-11-23T02:48:20.7426093Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7426236Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7426621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7426810Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7427186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7427311Z _lazy_init(state, module) 2022-11-23T02:48:20.7427667Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.7427817Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7428151Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7428280Z return func(*args, **kwargs) 2022-11-23T02:48:20.7428667Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7428772Z p_assert( 2022-11-23T02:48:20.7429118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7429317Z traceback.print_stack() 2022-11-23T02:48:20.7429558Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7429800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7429915Z File "", line 1, in 2022-11-23T02:48:20.7430134Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7430278Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7430478Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7430632Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7430846Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7430948Z self.run() 2022-11-23T02:48:20.7431153Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7431288Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7431641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7431776Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7432143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7432322Z getattr(self, test_name)() 
2022-11-23T02:48:20.7432704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7432807Z fn() 2022-11-23T02:48:20.7433164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7433292Z test(self, **param_kwargs) 2022-11-23T02:48:20.7433659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7433795Z return func(*args, **kwargs) 2022-11-23T02:48:20.7434038Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7434155Z self.run_subtests( 2022-11-23T02:48:20.7434514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7434686Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7435207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7435378Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7435765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7435891Z output = model(*input) 2022-11-23T02:48:20.7436226Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7436376Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7436760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7436945Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7437314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.7437437Z _lazy_init(state, module) 2022-11-23T02:48:20.7437796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7437944Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7438293Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7438424Z return func(*args, **kwargs) 2022-11-23T02:48:20.7438806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7439009Z p_assert( 2022-11-23T02:48:20.7439340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7439471Z traceback.print_stack() 2022-11-23T02:48:20.7439604Z File "", line 1, in 2022-11-23T02:48:20.7439824Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7439970Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7440177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7440332Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7440544Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7440635Z self.run() 2022-11-23T02:48:20.7440839Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7440989Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7441335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7441471Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7441837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7442025Z getattr(self, test_name)() 
2022-11-23T02:48:20.7442413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7442497Z fn() 2022-11-23T02:48:20.7442869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7442996Z test(self, **param_kwargs) 2022-11-23T02:48:20.7443359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7443494Z return func(*args, **kwargs) 2022-11-23T02:48:20.7443735Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7443852Z self.run_subtests( 2022-11-23T02:48:20.7444194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7444362Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7444733Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7444888Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7445269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7445392Z output = model(*input) 2022-11-23T02:48:20.7445723Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7445870Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7446253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7446417Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7446794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.7446921Z _lazy_init(state, module) 2022-11-23T02:48:20.7447276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7447424Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7447770Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7447896Z return func(*args, **kwargs) 2022-11-23T02:48:20.7448284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7448433Z p_assert( 2022-11-23T02:48:20.7448783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7448909Z traceback.print_stack() 2022-11-23T02:48:20.7449149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7449396Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7449528Z File "", line 1, in 2022-11-23T02:48:20.7449740Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7449868Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7450073Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7450226Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7450442Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7450551Z self.run() 2022-11-23T02:48:20.7450754Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7450899Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7451303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7451434Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7451802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7451930Z getattr(self, test_name)() 2022-11-23T02:48:20.7452293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7452396Z fn() 2022-11-23T02:48:20.7452770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7452903Z test(self, **param_kwargs) 2022-11-23T02:48:20.7453266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7453378Z return func(*args, **kwargs) 2022-11-23T02:48:20.7453618Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7453738Z self.run_subtests( 2022-11-23T02:48:20.7454097Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7454265Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7454633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7454788Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7455176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7455287Z output = model(*input) 2022-11-23T02:48:20.7455621Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7455765Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7456156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7456339Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7456717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7456840Z _lazy_init(state, module) 2022-11-23T02:48:20.7457196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7457326Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7457738Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7457867Z return func(*args, **kwargs) 2022-11-23T02:48:20.7458255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7458362Z p_assert( 2022-11-23T02:48:20.7458707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7458837Z traceback.print_stack() 2022-11-23T02:48:20.7458966Z File "", line 1, in 2022-11-23T02:48:20.7459165Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7459303Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7459511Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7459664Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7459882Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7459988Z self.run() 2022-11-23T02:48:20.7460191Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7460325Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7460726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7460876Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7461253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7461378Z getattr(self, test_name)() 2022-11-23T02:48:20.7461737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7461838Z fn() 2022-11-23T02:48:20.7462203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7462319Z test(self, **param_kwargs) 2022-11-23T02:48:20.7462685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7462814Z return func(*args, **kwargs) 2022-11-23T02:48:20.7463056Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7463175Z self.run_subtests( 2022-11-23T02:48:20.7463535Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7463702Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7464074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7464216Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7464596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7464721Z output = model(*input) 2022-11-23T02:48:20.7465052Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7465198Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7465588Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7465770Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7466148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7466256Z _lazy_init(state, module) 2022-11-23T02:48:20.7466613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7466758Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7467168Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7467295Z return func(*args, **kwargs) 2022-11-23T02:48:20.7467680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7467789Z p_assert( 2022-11-23T02:48:20.7468139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7468255Z traceback.print_stack() 2022-11-23T02:48:20.7468499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7468744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7468873Z File "", line 1, in 2022-11-23T02:48:20.7469087Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7469230Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7469439Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7469595Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7469797Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7469903Z self.run() 2022-11-23T02:48:20.7470158Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7470316Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7470666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7470808Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7471175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7471285Z getattr(self, test_name)() 2022-11-23T02:48:20.7471653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7471760Z fn() 2022-11-23T02:48:20.7472134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7472260Z test(self, **param_kwargs) 2022-11-23T02:48:20.7472625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7472756Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.7472999Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7473099Z self.run_subtests( 2022-11-23T02:48:20.7473458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7473625Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7474040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7474201Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7474581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7474701Z output = model(*input) 2022-11-23T02:48:20.7475197Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7475340Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7475735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7475918Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7476293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7476419Z _lazy_init(state, module) 2022-11-23T02:48:20.7476888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7477034Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7477385Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7477496Z return func(*args, **kwargs) 2022-11-23T02:48:20.7477885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7477990Z p_assert( 2022-11-23T02:48:20.7478328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7478458Z traceback.print_stack() 2022-11-23T02:48:20.7478592Z File "", line 1, in 2022-11-23T02:48:20.7478806Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7478954Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7479141Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7479290Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7479506Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7479612Z self.run() 2022-11-23T02:48:20.7479886Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7480052Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7480400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7480539Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7480889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7481016Z getattr(self, test_name)() 2022-11-23T02:48:20.7481381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7481488Z fn() 2022-11-23T02:48:20.7481861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7481990Z test(self, **param_kwargs) 2022-11-23T02:48:20.7482359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7482473Z return func(*args, **kwargs) 
2022-11-23T02:48:20.7482720Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7482834Z self.run_subtests( 2022-11-23T02:48:20.7483190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7483361Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7483730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7483890Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7484277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7484385Z output = model(*input) 2022-11-23T02:48:20.7484720Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7484866Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7485246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7485427Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7485803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7485929Z _lazy_init(state, module) 2022-11-23T02:48:20.7486373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7486523Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7486852Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7486983Z return func(*args, **kwargs) 2022-11-23T02:48:20.7487370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.7487484Z p_assert( 2022-11-23T02:48:20.7487833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7487963Z traceback.print_stack() 2022-11-23T02:48:20.7488188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7488429Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7488568Z File "", line 1, in 2022-11-23T02:48:20.7488783Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7488930Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7489134Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7489396Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7489627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7489717Z self.run() 2022-11-23T02:48:20.7489923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7490068Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7490419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7490555Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7490931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7491062Z getattr(self, test_name)() 2022-11-23T02:48:20.7491427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7491512Z fn() 2022-11-23T02:48:20.7491891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7492019Z test(self, **param_kwargs) 
2022-11-23T02:48:20.7492381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7492509Z return func(*args, **kwargs) 2022-11-23T02:48:20.7492751Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7492866Z self.run_subtests( 2022-11-23T02:48:20.7493209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7493379Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7493749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7493908Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7494294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7494419Z output = model(*input) 2022-11-23T02:48:20.7494752Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7494896Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7495280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7495446Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7495891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7496019Z _lazy_init(state, module) 2022-11-23T02:48:20.7496376Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7496526Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7496870Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:48:20.7496998Z return func(*args, **kwargs) 2022-11-23T02:48:20.7497383Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7497473Z p_assert( 2022-11-23T02:48:20.7497822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7497959Z traceback.print_stack() 2022-11-23T02:48:20.7498089Z File "", line 1, in 2022-11-23T02:48:20.7498302Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7498447Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7498652Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7498839Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7499071Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7499177Z self.run() 2022-11-23T02:48:20.7499382Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7499530Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7499878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7500016Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7500390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7500501Z getattr(self, test_name)() 2022-11-23T02:48:20.7500867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7500967Z fn() 2022-11-23T02:48:20.7501344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7501471Z test(self, **param_kwargs) 2022-11-23T02:48:20.7501835Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7501965Z return func(*args, **kwargs) 2022-11-23T02:48:20.7502208Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7502308Z self.run_subtests( 2022-11-23T02:48:20.7502680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7502849Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7503223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7503382Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7503770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7503893Z output = model(*input) 2022-11-23T02:48:20.7504228Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7504356Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7504740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7504982Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7505353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7505478Z _lazy_init(state, module) 2022-11-23T02:48:20.7505834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7505991Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7506336Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:48:20.7506450Z return func(*args, **kwargs) 2022-11-23T02:48:20.7506833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7506940Z p_assert( 2022-11-23T02:48:20.7507282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7507419Z traceback.print_stack() 2022-11-23T02:48:20.7507664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7507901Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7508034Z File "", line 1, in 2022-11-23T02:48:20.7508280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7508434Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7508641Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7508791Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7509005Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7509111Z self.run() 2022-11-23T02:48:20.7509316Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7509447Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7509807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7509942Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7510308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7510437Z getattr(self, test_name)() 2022-11-23T02:48:20.7510802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7510902Z fn() 
2022-11-23T02:48:20.7511272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7511382Z test(self, **param_kwargs) 2022-11-23T02:48:20.7511745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7511873Z return func(*args, **kwargs) 2022-11-23T02:48:20.7512124Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7512242Z self.run_subtests( 2022-11-23T02:48:20.7512603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7512773Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7513143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7513283Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7513665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7513786Z output = model(*input) 2022-11-23T02:48:20.7514120Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7514324Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7514710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7514885Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7515445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7515558Z _lazy_init(state, module) 2022-11-23T02:48:20.7515922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:48:20.7516069Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7516412Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7516540Z return func(*args, **kwargs) 2022-11-23T02:48:20.7516926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7517038Z p_assert( 2022-11-23T02:48:20.7517382Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7517496Z traceback.print_stack() 2022-11-23T02:48:20.7517628Z File "", line 1, in 2022-11-23T02:48:20.7517924Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7518081Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7518290Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7518441Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7518659Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7518763Z self.run() 2022-11-23T02:48:20.7518954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7519107Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7519457Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7519590Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7519954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7520082Z getattr(self, test_name)() 2022-11-23T02:48:20.7520445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7520532Z fn() 2022-11-23T02:48:20.7520904Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7521027Z test(self, **param_kwargs) 2022-11-23T02:48:20.7521387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7521521Z return func(*args, **kwargs) 2022-11-23T02:48:20.7521765Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7521879Z self.run_subtests( 2022-11-23T02:48:20.7522240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7522395Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7522761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7522914Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7523299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7523424Z output = model(*input) 2022-11-23T02:48:20.7523759Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7523985Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7524373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7524535Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7524914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7525036Z _lazy_init(state, module) 2022-11-23T02:48:20.7525388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:48:20.7525532Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7525880Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7526008Z return func(*args, **kwargs) 2022-11-23T02:48:20.7526393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7526488Z p_assert( 2022-11-23T02:48:20.7526835Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7526963Z traceback.print_stack() 2022-11-23T02:48:20.7527203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7527492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7527633Z File "", line 1, in 2022-11-23T02:48:20.7527844Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7527987Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7528175Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7528323Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7528536Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7528638Z self.run() 2022-11-23T02:48:20.7528843Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7528987Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7529328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7529463Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7529817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7529941Z getattr(self, test_name)() 
2022-11-23T02:48:20.7530307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7530407Z fn() 2022-11-23T02:48:20.7530783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7530913Z test(self, **param_kwargs) 2022-11-23T02:48:20.7531280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7531405Z return func(*args, **kwargs) 2022-11-23T02:48:20.7531630Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7531747Z self.run_subtests( 2022-11-23T02:48:20.7532108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7532276Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7532648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7532799Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7533190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7533381Z output = model(*input) 2022-11-23T02:48:20.7533705Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7533844Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7534226Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7534403Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7534774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.7534899Z _lazy_init(state, module) 2022-11-23T02:48:20.7535253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7535395Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7535726Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7535858Z return func(*args, **kwargs) 2022-11-23T02:48:20.7536241Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7536345Z p_assert( 2022-11-23T02:48:20.7536743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7536879Z traceback.print_stack() 2022-11-23T02:48:20.7537006Z File "", line 1, in 2022-11-23T02:48:20.7537206Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7537348Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7537554Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7537707Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7537922Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7538029Z self.run() 2022-11-23T02:48:20.7538231Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7538380Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7538712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7538849Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7539219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7539342Z getattr(self, test_name)() 
2022-11-23T02:48:20.7539706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7539807Z fn() 2022-11-23T02:48:20.7540174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7540307Z test(self, **param_kwargs) 2022-11-23T02:48:20.7540658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7540787Z return func(*args, **kwargs) 2022-11-23T02:48:20.7541025Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7541142Z self.run_subtests( 2022-11-23T02:48:20.7541498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7541664Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7542032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7542187Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7542555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7542742Z output = model(*input) 2022-11-23T02:48:20.7543079Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7543219Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7543607Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7543786Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7544163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:48:20.7544287Z _lazy_init(state, module) 2022-11-23T02:48:20.7544630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7544773Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7545118Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7545249Z return func(*args, **kwargs) 2022-11-23T02:48:20.7545627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7545723Z p_assert( 2022-11-23T02:48:20.7546108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7546247Z traceback.print_stack() 2022-11-23T02:48:20.7546471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7546707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7546833Z File "", line 1, in 2022-11-23T02:48:20.7547035Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7547171Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7547380Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7547533Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7547734Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7547841Z self.run() 2022-11-23T02:48:20.7548045Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7548194Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7548539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7548675Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7549041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7549162Z getattr(self, test_name)() 2022-11-23T02:48:20.7549512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7549617Z fn() 2022-11-23T02:48:20.7549988Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7550113Z test(self, **param_kwargs) 2022-11-23T02:48:20.7550481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7550606Z return func(*args, **kwargs) 2022-11-23T02:48:20.7550850Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7550964Z self.run_subtests( 2022-11-23T02:48:20.7551309Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7551475Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7551844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7552078Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7552460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7552582Z output = model(*input) 2022-11-23T02:48:20.7552917Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7553060Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7553426Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7553600Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7553969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7554094Z _lazy_init(state, module) 2022-11-23T02:48:20.7554456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7554605Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7554946Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7555246Z return func(*args, **kwargs) 2022-11-23T02:48:20.7555714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7555835Z p_assert( 2022-11-23T02:48:20.7556187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7556313Z traceback.print_stack() 2022-11-23T02:48:20.7556443Z File "", line 1, in 2022-11-23T02:48:20.7556654Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7556800Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7557011Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7557148Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7557359Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7557461Z self.run() 2022-11-23T02:48:20.7557668Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7557818Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7558164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7558296Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7558650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7558772Z getattr(self, test_name)() 2022-11-23T02:48:20.7559136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7559243Z fn() 2022-11-23T02:48:20.7559613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7559737Z test(self, **param_kwargs) 2022-11-23T02:48:20.7560099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7560224Z return func(*args, **kwargs) 2022-11-23T02:48:20.7560451Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7560561Z self.run_subtests( 2022-11-23T02:48:20.7560918Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7561089Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7561459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7561693Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7562081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7562203Z output = model(*input) 2022-11-23T02:48:20.7562520Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7562665Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7563051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7563234Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7563603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7563727Z _lazy_init(state, module) 2022-11-23T02:48:20.7564085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7564228Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7564559Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7564688Z return func(*args, **kwargs) 2022-11-23T02:48:20.7565129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7565241Z p_assert( 2022-11-23T02:48:20.7565580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:48:20.7565706Z traceback.print_stack() 2022-11-23T02:48:20.7565943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7566181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7566305Z File "", line 1, in 2022-11-23T02:48:20.7566514Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7566657Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7566856Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7567010Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7567231Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7567337Z self.run() 2022-11-23T02:48:20.7567546Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7567676Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7568024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7568154Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7568527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7568654Z getattr(self, test_name)() 2022-11-23T02:48:20.7569021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7569125Z fn() 2022-11-23T02:48:20.7569483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7569609Z test(self, **param_kwargs) 2022-11-23T02:48:20.7569973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7570099Z return func(*args, 
**kwargs) 2022-11-23T02:48:20.7570337Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7570451Z self.run_subtests( 2022-11-23T02:48:20.7570810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7571040Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7571404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7571557Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7571943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7572067Z output = model(*input) 2022-11-23T02:48:20.7572390Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7572526Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7572908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7573086Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7573467Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7573577Z _lazy_init(state, module) 2022-11-23T02:48:20.7573985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7574187Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7574545Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7574669Z return func(*args, **kwargs) 2022-11-23T02:48:20.7575053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:48:20.7575156Z p_assert( 2022-11-23T02:48:20.7575500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7575614Z traceback.print_stack() 2022-11-23T02:48:20.7575751Z File "", line 1, in 2022-11-23T02:48:20.7575966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:48:20.7576110Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:48:20.7576313Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:48:20.7576469Z return self._bootstrap(parent_sentinel) 2022-11-23T02:48:20.7576686Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:48:20.7576778Z self.run() 2022-11-23T02:48:20.7576980Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:48:20.7577128Z self._target(*self._args, **self._kwargs) 2022-11-23T02:48:20.7577471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:48:20.7577606Z self.run_test(test_name, pipe) 2022-11-23T02:48:20.7577973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:48:20.7578097Z getattr(self, test_name)() 2022-11-23T02:48:20.7578468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:48:20.7578553Z fn() 2022-11-23T02:48:20.7578928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:48:20.7579053Z test(self, **param_kwargs) 2022-11-23T02:48:20.7579414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:48:20.7579536Z return func(*args, **kwargs) 
2022-11-23T02:48:20.7579777Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:48:20.7579889Z self.run_subtests( 2022-11-23T02:48:20.7580245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:48:20.7580464Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:48:20.7580842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:48:20.7580992Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:48:20.7581376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:48:20.7581502Z output = model(*input) 2022-11-23T02:48:20.7581834Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:48:20.7581973Z return forward_call(*input, **kwargs) 2022-11-23T02:48:20.7582351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:48:20.7582518Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:48:20.7582891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:48:20.7583013Z _lazy_init(state, module) 2022-11-23T02:48:20.7583372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:48:20.7583573Z handle.init_flat_param_attributes() 2022-11-23T02:48:20.7583925Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:48:20.7584049Z return func(*args, **kwargs) 2022-11-23T02:48:20.7584438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:48:20.7584531Z p_assert( 2022-11-23T02:48:20.7584875Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:48:20.7585008Z traceback.print_stack() 2022-11-23T02:48:20.7585245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7585480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7585714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7585951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7586188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7586404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7587169Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7587932Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7588173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7588407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7588638Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7588869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7589099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7589325Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7589619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7589832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7590057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7590284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7590512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7590739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7591495Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7592249Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7592531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7592769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7592997Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7593214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7593442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7593675Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7593908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7594136Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7594364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7594591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7594819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7595266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7596034Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7596796Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7597035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:20.7597263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7597487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7597714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7597938Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7598264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7598497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7598721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7598939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7599162Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7599388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7599620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7600387Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:48:20.7601197Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:48:20.7601445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7601677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7601904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7602130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7602345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7602575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:20.7602688Z dist init r=0, world=2 2022-11-23T02:48:20.7603026Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7603355Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7603671Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7603974Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7604285Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7604589Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:48:20.7604905Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7605196Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7605501Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7605800Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7606172Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7606481Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:48:20.7606591Z dist init r=1, world=2 2022-11-23T02:48:20.7606917Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7607232Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7607541Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7607855Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:48:20.7608218Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7608538Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7608827Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7609133Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7609441Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7609746Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7610053Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7610352Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:48:20.7610447Z ok (9.822s) 2022-11-23T02:48:20.7610472Z 2022-11-23T02:48:20.7610747Z ---------------------------------------------------------------------- 2022-11-23T02:48:20.7610863Z Ran 59 tests in 566.965s 2022-11-23T02:48:20.7610887Z 2022-11-23T02:48:20.7610995Z OK (skipped=5) 2022-11-23T02:48:20.7611014Z 2022-11-23T02:48:20.7611122Z Generating XML reports... 
2022-11-23T02:48:20.7611540Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20221123023852.xml 2022-11-23T02:48:20.7611957Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20221123023852.xml 2022-11-23T02:48:20.7612369Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20221123023852.xml 2022-11-23T02:48:20.7612806Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20221123023852.xml 2022-11-23T02:48:20.7612827Z 2022-11-23T02:48:20.7613275Z ##[endgroup] 2022-11-23T02:48:20.7613741Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_core_xt0conde) 2022-11-23T02:48:20.7613840Z 2022-11-23T02:48:20.7613905Z 2022-11-23T02:48:20.7614024Z real 89m26.389s 2022-11-23T02:48:20.7614111Z user 149m7.390s 2022-11-23T02:48:20.7614208Z sys 77m10.636s 2022-11-23T02:48:20.7614320Z + assert_git_not_dirty 2022-11-23T02:48:20.7614568Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 != *rocm* ]] 2022-11-23T02:48:20.7614801Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 != *xla* ]] 2022-11-23T02:48:20.7614968Z ++ git status --porcelain 2022-11-23T02:48:21.5190490Z + git_status= 2022-11-23T02:48:21.5191177Z + [[ -n '' ]] 2022-11-23T02:48:21.5191623Z + [[ linux-bionic-cuda11.7-py3.10-gcc7 == *cuda* ]] 2022-11-23T02:48:21.5191910Z + [[ 3 == 1 ]] 2022-11-23T02:48:21.5192141Z + [[ 3 == 1 ]] 2022-11-23T02:48:21.5262231Z Prepare all required actions 2022-11-23T02:48:21.5262652Z Getting action download info 2022-11-23T02:48:21.7140694Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2022-11-23T02:48:21.9070099Z ##[group]Run ./.github/actions/get-workflow-job-id 2022-11-23T02:48:21.9070403Z with: 2022-11-23T02:48:21.9070893Z github-token: 
*** 2022-11-23T02:48:21.9071144Z env: 2022-11-23T02:48:21.9071371Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:21.9071644Z GPU_FLAG: --gpus all 2022-11-23T02:48:21.9072025Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:21.9072373Z ##[endgroup] 2022-11-23T02:48:21.9105264Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2022-11-23T02:48:21.9105587Z with: 2022-11-23T02:48:21.9105812Z shell: bash 2022-11-23T02:48:21.9106040Z timeout_minutes: 10 2022-11-23T02:48:21.9106288Z max_attempts: 5 2022-11-23T02:48:21.9106578Z retry_wait_seconds: 30 2022-11-23T02:48:21.9107081Z command: set -eux python3 -m pip install requests==2.26.0 GHA_WORKFLOW_JOB_ID=$(python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}") echo "job-id=${GHA_WORKFLOW_JOB_ID}" >> "${GITHUB_OUTPUT}" 2022-11-23T02:48:21.9107603Z polling_interval_seconds: 1 2022-11-23T02:48:21.9107878Z warning_on_retry: true 2022-11-23T02:48:21.9108143Z continue_on_error: false 2022-11-23T02:48:21.9108393Z env: 2022-11-23T02:48:21.9108631Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:21.9108900Z GPU_FLAG: --gpus all 2022-11-23T02:48:21.9109267Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:21.9109790Z GITHUB_TOKEN: *** 2022-11-23T02:48:21.9110038Z ##[endgroup] 2022-11-23T02:48:21.9781686Z + python3 -m pip install requests==2.26.0 2022-11-23T02:48:22.2709372Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T02:48:22.2938690Z Requirement already satisfied: requests==2.26.0 in /home/ec2-user/.local/lib/python3.7/site-packages (2.26.0) 2022-11-23T02:48:22.3126815Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests==2.26.0) (1.26.12) 2022-11-23T02:48:22.3353335Z Requirement already satisfied: certifi>=2017.4.17 in 
/home/ec2-user/.local/lib/python3.7/site-packages (from requests==2.26.0) (2022.9.24) 2022-11-23T02:48:22.3366872Z Requirement already satisfied: charset-normalizer~=2.0.0; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests==2.26.0) (2.0.12) 2022-11-23T02:48:22.3393522Z Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests==2.26.0) (3.4) 2022-11-23T02:48:22.5890415Z ++ python3 .github/scripts/get_workflow_job_id.py 3528293554 i-088dc030290e38a53 2022-11-23T02:48:25.2768297Z + GHA_WORKFLOW_JOB_ID=9655199221 2022-11-23T02:48:25.2770739Z + echo job-id=9655199221 2022-11-23T02:48:25.9801478Z Command completed after 1 attempt(s). 2022-11-23T02:48:25.9958994Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2022-11-23T02:48:25.9959402Z kill "$MONITOR_SCRIPT_PID" 2022-11-23T02:48:25.9974001Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:48:25.9974507Z env: 2022-11-23T02:48:25.9974776Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:25.9975046Z GPU_FLAG: --gpus all 2022-11-23T02:48:25.9975451Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:25.9975855Z MONITOR_SCRIPT_PID: 16510 2022-11-23T02:48:25.9976113Z ##[endgroup] 2022-11-23T02:48:26.0083024Z Prepare all required actions 2022-11-23T02:48:26.0083440Z Getting action download info 2022-11-23T02:48:26.1656715Z Download action repository 'actions/upload-artifact@v3' (SHA:83fd05a356d7e2593de66fc9913b3002723633cb) 2022-11-23T02:48:26.3138747Z ##[group]Run ./.github/actions/upload-test-artifacts 2022-11-23T02:48:26.3139050Z with: 2022-11-23T02:48:26.3139409Z file-suffix: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221 2022-11-23T02:48:26.3139745Z env: 2022-11-23T02:48:26.3139987Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:26.3140258Z GPU_FLAG: --gpus all 2022-11-23T02:48:26.3140623Z DOCKER_CONTAINER_ID: 
d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:26.3141002Z ##[endgroup] 2022-11-23T02:48:26.3171646Z ##[group]Run # Remove any previous test jsons if they exist 2022-11-23T02:48:26.3172077Z # Remove any previous test jsons if they exist 2022-11-23T02:48:26.3172396Z rm -f test-jsons-*.zip 2022-11-23T02:48:26.3172773Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json' 2022-11-23T02:48:26.3184696Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:48:26.3184999Z env: 2022-11-23T02:48:26.3185244Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:26.3185500Z GPU_FLAG: --gpus all 2022-11-23T02:48:26.3185883Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:26.3186372Z FILE_SUFFIX: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221 2022-11-23T02:48:26.3186716Z ##[endgroup] 2022-11-23T02:48:26.3305109Z adding: test/allowlist_for_publicAPI.json (deflated 79%) 2022-11-23T02:48:26.3339460Z adding: test/benchmark_utils/callgrind_artifacts.json (deflated 92%) 2022-11-23T02:48:26.3346682Z adding: test/profiler/profiler_utils_mock_events.json (deflated 87%) 2022-11-23T02:48:26.3347888Z adding: test/.pytorch-slow-tests.json (deflated 73%) 2022-11-23T02:48:26.3359213Z adding: test/.pytorch-disabled-tests.json (deflated 86%) 2022-11-23T02:48:26.3382352Z ##[group]Run # Remove any previous test reports if they exist 2022-11-23T02:48:26.3382749Z # Remove any previous test reports if they exist 2022-11-23T02:48:26.3383080Z rm -f test-reports-*.zip 2022-11-23T02:48:26.3383441Z zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml' -i '*.csv' 2022-11-23T02:48:26.3394886Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:48:26.3395817Z env: 2022-11-23T02:48:26.3396075Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:26.3396327Z GPU_FLAG: --gpus all 2022-11-23T02:48:26.3396711Z DOCKER_CONTAINER_ID: 
d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:26.3397209Z FILE_SUFFIX: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221 2022-11-23T02:48:26.3397570Z ##[endgroup] 2022-11-23T02:48:26.3544570Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011903.xml (deflated 41%) 2022-11-23T02:48:26.3545409Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011911.xml (deflated 42%) 2022-11-23T02:48:26.3546220Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011914.xml (deflated 42%) 2022-11-23T02:48:26.3547028Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011921.xml (deflated 42%) 2022-11-23T02:48:26.3547812Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011925.xml (deflated 41%) 2022-11-23T02:48:26.3548765Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011933.xml (deflated 41%) 2022-11-23T02:48:26.3549558Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011941.xml (deflated 40%) 2022-11-23T02:48:26.3550441Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011950.xml (deflated 40%) 2022-11-23T02:48:26.3551253Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123011958.xml (deflated 41%) 2022-11-23T02:48:26.3552021Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012006.xml (deflated 40%) 2022-11-23T02:48:26.3552804Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012014.xml (deflated 40%) 
2022-11-23T02:48:26.3553596Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012023.xml (deflated 41%) 2022-11-23T02:48:26.3554386Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012032.xml (deflated 40%) 2022-11-23T02:48:26.3555663Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012040.xml (deflated 42%) 2022-11-23T02:48:26.3556482Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012044.xml (deflated 41%) 2022-11-23T02:48:26.3557265Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012051.xml (deflated 42%) 2022-11-23T02:48:26.3558043Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012055.xml (deflated 42%) 2022-11-23T02:48:26.3558805Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012103.xml (deflated 42%) 2022-11-23T02:48:26.3559606Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012105.xml (deflated 45%) 2022-11-23T02:48:26.3560381Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012108.xml (deflated 47%) 2022-11-23T02:48:26.3561157Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012110.xml (deflated 48%) 2022-11-23T02:48:26.3561920Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012113.xml (deflated 45%) 2022-11-23T02:48:26.3562706Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012115.xml (deflated 41%) 2022-11-23T02:48:26.3563482Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012124.xml (deflated 43%) 2022-11-23T02:48:26.3564271Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012126.xml (deflated 44%) 2022-11-23T02:48:26.3565034Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012129.xml (deflated 43%) 2022-11-23T02:48:26.3565824Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012131.xml (deflated 44%) 2022-11-23T02:48:26.3566604Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012133.xml (deflated 44%) 2022-11-23T02:48:26.3567382Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012136.xml (deflated 40%) 2022-11-23T02:48:26.3568156Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012144.xml (deflated 41%) 2022-11-23T02:48:26.3569031Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012146.xml (deflated 41%) 2022-11-23T02:48:26.3569831Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012149.xml (deflated 41%) 2022-11-23T02:48:26.3570682Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012157.xml (deflated 42%) 2022-11-23T02:48:26.3571477Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012204.xml (deflated 43%) 2022-11-23T02:48:26.3572240Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012207.xml (deflated 43%) 2022-11-23T02:48:26.3573016Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012209.xml (deflated 42%) 2022-11-23T02:48:26.3573839Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012211.xml (deflated 43%) 2022-11-23T02:48:26.3574624Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012214.xml (deflated 40%) 2022-11-23T02:48:26.3575380Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012222.xml (deflated 40%) 2022-11-23T02:48:26.3576243Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012231.xml (deflated 43%) 2022-11-23T02:48:26.3577019Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012234.xml (deflated 41%) 2022-11-23T02:48:26.3577796Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012236.xml (deflated 41%) 2022-11-23T02:48:26.3578558Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012238.xml (deflated 41%) 2022-11-23T02:48:26.3579350Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012241.xml (deflated 41%) 2022-11-23T02:48:26.3580134Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012243.xml (deflated 41%) 2022-11-23T02:48:26.3580923Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012246.xml (deflated 41%) 2022-11-23T02:48:26.3581700Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012248.xml (deflated 41%) 2022-11-23T02:48:26.3582461Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012250.xml (deflated 41%) 2022-11-23T02:48:26.3583245Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012253.xml (deflated 41%) 2022-11-23T02:48:26.3584028Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012255.xml (deflated 41%) 2022-11-23T02:48:26.3584810Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012302.xml (deflated 41%) 2022-11-23T02:48:26.3585573Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012304.xml (deflated 42%) 2022-11-23T02:48:26.3586348Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012307.xml (deflated 41%) 2022-11-23T02:48:26.3587124Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012309.xml (deflated 40%) 2022-11-23T02:48:26.3587917Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012316.xml (deflated 40%) 2022-11-23T02:48:26.3588753Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012324.xml (deflated 41%) 2022-11-23T02:48:26.3589580Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012333.xml (deflated 41%) 2022-11-23T02:48:26.3590364Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012341.xml (deflated 40%) 2022-11-23T02:48:26.3591148Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012350.xml (deflated 42%) 2022-11-23T02:48:26.3591909Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012356.xml (deflated 42%) 2022-11-23T02:48:26.3592686Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012403.xml (deflated 42%) 2022-11-23T02:48:26.3593470Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012410.xml (deflated 42%) 2022-11-23T02:48:26.3594253Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012417.xml (deflated 41%) 2022-11-23T02:48:26.3595413Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012425.xml (deflated 41%) 2022-11-23T02:48:26.3596320Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012433.xml (deflated 41%) 2022-11-23T02:48:26.3597099Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012442.xml (deflated 41%) 2022-11-23T02:48:26.3597888Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012451.xml (deflated 41%) 2022-11-23T02:48:26.3598671Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012459.xml (deflated 41%) 2022-11-23T02:48:26.3599430Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012507.xml (deflated 41%) 2022-11-23T02:48:26.3600214Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012515.xml (deflated 41%) 2022-11-23T02:48:26.3600997Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012524.xml (deflated 40%) 2022-11-23T02:48:26.3601769Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012532.xml (deflated 41%) 2022-11-23T02:48:26.3602530Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012535.xml (deflated 41%) 2022-11-23T02:48:26.3603310Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012537.xml (deflated 41%) 2022-11-23T02:48:26.3604089Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012540.xml (deflated 42%) 2022-11-23T02:48:26.3604870Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012542.xml (deflated 42%) 2022-11-23T02:48:26.3605632Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012544.xml (deflated 42%) 2022-11-23T02:48:26.3606406Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012547.xml (deflated 41%) 2022-11-23T02:48:26.3607184Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012549.xml (deflated 42%) 2022-11-23T02:48:26.3607958Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012552.xml (deflated 42%) 2022-11-23T02:48:26.3608826Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012554.xml (deflated 42%) 2022-11-23T02:48:26.3609683Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012556.xml (deflated 42%) 2022-11-23T02:48:26.3610477Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012559.xml (deflated 42%) 2022-11-23T02:48:26.3611257Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012601.xml (deflated 42%) 2022-11-23T02:48:26.3612017Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012604.xml (deflated 43%) 2022-11-23T02:48:26.3612790Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012606.xml (deflated 42%) 2022-11-23T02:48:26.3613573Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012608.xml (deflated 42%) 2022-11-23T02:48:26.3614348Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012611.xml (deflated 42%) 2022-11-23T02:48:26.3615110Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012613.xml (deflated 42%) 2022-11-23T02:48:26.3615883Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012616.xml (deflated 43%) 2022-11-23T02:48:26.3616650Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012618.xml (deflated 42%) 2022-11-23T02:48:26.3617424Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012620.xml (deflated 42%) 2022-11-23T02:48:26.3618189Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012623.xml (deflated 42%) 2022-11-23T02:48:26.3618972Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012625.xml (deflated 42%) 2022-11-23T02:48:26.3619749Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012628.xml (deflated 42%) 2022-11-23T02:48:26.3620522Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012630.xml (deflated 42%) 2022-11-23T02:48:26.3621302Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012632.xml (deflated 42%) 2022-11-23T02:48:26.3622065Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012635.xml (deflated 43%) 2022-11-23T02:48:26.3622849Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012637.xml (deflated 41%) 2022-11-23T02:48:26.3623643Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012646.xml (deflated 41%) 2022-11-23T02:48:26.3624428Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012653.xml (deflated 42%) 2022-11-23T02:48:26.3625191Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012655.xml (deflated 42%) 2022-11-23T02:48:26.3625967Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012658.xml (deflated 40%) 2022-11-23T02:48:26.3626737Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012707.xml (deflated 42%) 2022-11-23T02:48:26.3627514Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012709.xml (deflated 42%) 2022-11-23T02:48:26.3628338Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012716.xml (deflated 42%) 2022-11-23T02:48:26.3629154Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012718.xml (deflated 42%) 2022-11-23T02:48:26.3629943Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012725.xml (deflated 42%) 2022-11-23T02:48:26.3630717Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012727.xml (deflated 43%) 2022-11-23T02:48:26.3631479Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012730.xml (deflated 43%) 2022-11-23T02:48:26.3632257Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012732.xml (deflated 41%) 2022-11-23T02:48:26.3633038Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012735.xml (deflated 41%) 2022-11-23T02:48:26.3633817Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012737.xml (deflated 41%) 2022-11-23T02:48:26.3634579Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012739.xml (deflated 41%) 2022-11-23T02:48:26.3635862Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012742.xml (deflated 41%) 2022-11-23T02:48:26.3636649Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012744.xml (deflated 41%) 2022-11-23T02:48:26.3637425Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012747.xml (deflated 41%) 2022-11-23T02:48:26.3638191Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012749.xml (deflated 40%) 2022-11-23T02:48:26.3638988Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012751.xml (deflated 41%) 2022-11-23T02:48:26.3639763Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012754.xml (deflated 41%) 2022-11-23T02:48:26.3640537Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012756.xml (deflated 40%) 2022-11-23T02:48:26.3641330Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012805.xml (deflated 41%) 2022-11-23T02:48:26.3642088Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012807.xml (deflated 40%) 2022-11-23T02:48:26.3642869Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012816.xml (deflated 42%) 2022-11-23T02:48:26.3643646Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012822.xml (deflated 41%) 2022-11-23T02:48:26.3644425Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012831.xml (deflated 42%) 2022-11-23T02:48:26.3645192Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012835.xml (deflated 42%) 2022-11-23T02:48:26.3645968Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012839.xml (deflated 42%) 2022-11-23T02:48:26.3646743Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012843.xml (deflated 40%) 2022-11-23T02:48:26.3647518Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012852.xml (deflated 40%) 2022-11-23T02:48:26.3648388Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012900.xml (deflated 42%) 2022-11-23T02:48:26.3649245Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012904.xml (deflated 42%) 2022-11-23T02:48:26.3650038Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012909.xml (deflated 40%) 2022-11-23T02:48:26.3650804Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012917.xml (deflated 40%) 2022-11-23T02:48:26.3651570Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012925.xml (deflated 40%) 2022-11-23T02:48:26.3652349Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012933.xml (deflated 40%) 2022-11-23T02:48:26.3653133Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012942.xml (deflated 42%) 2022-11-23T02:48:26.3653910Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012946.xml (deflated 41%) 2022-11-23T02:48:26.3654669Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012954.xml (deflated 42%) 2022-11-23T02:48:26.3655444Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012958.xml (deflated 41%) 2022-11-23T02:48:26.3656219Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013007.xml (deflated 42%) 2022-11-23T02:48:26.3656992Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013011.xml (deflated 42%) 2022-11-23T02:48:26.3657763Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013015.xml (deflated 41%) 2022-11-23T02:48:26.3658533Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013024.xml (deflated 41%) 2022-11-23T02:48:26.3659310Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013033.xml (deflated 42%) 2022-11-23T02:48:26.3660084Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013037.xml (deflated 40%) 2022-11-23T02:48:26.3660876Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013046.xml (deflated 42%) 2022-11-23T02:48:26.3661631Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013048.xml (deflated 42%) 2022-11-23T02:48:26.3662407Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013051.xml (deflated 42%) 2022-11-23T02:48:26.3663189Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013053.xml (deflated 41%) 2022-11-23T02:48:26.3663973Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013055.xml (deflated 41%) 2022-11-23T02:48:26.3664738Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013058.xml (deflated 41%) 2022-11-23T02:48:26.3665511Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013100.xml (deflated 41%) 2022-11-23T02:48:26.3666285Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013103.xml (deflated 41%) 2022-11-23T02:48:26.3667127Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013105.xml (deflated 41%) 2022-11-23T02:48:26.3667887Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013107.xml (deflated 41%) 2022-11-23T02:48:26.3668709Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013110.xml (deflated 42%) 2022-11-23T02:48:26.3669495Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013112.xml (deflated 42%) 2022-11-23T02:48:26.3670273Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013115.xml (deflated 42%) 2022-11-23T02:48:26.3671030Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013119.xml (deflated 41%) 2022-11-23T02:48:26.3671798Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013128.xml (deflated 40%) 2022-11-23T02:48:26.3672575Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013136.xml (deflated 41%) 2022-11-23T02:48:26.3673392Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013144.xml (deflated 40%) 2022-11-23T02:48:26.3674152Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013153.xml (deflated 40%) 2022-11-23T02:48:26.3674929Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013201.xml (deflated 40%) 2022-11-23T02:48:26.3676342Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013219.xml (deflated 41%) 2022-11-23T02:48:26.3677125Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013228.xml (deflated 41%) 2022-11-23T02:48:26.3677888Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013236.xml (deflated 41%) 2022-11-23T02:48:26.3678666Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013245.xml (deflated 41%) 2022-11-23T02:48:26.3679446Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013253.xml (deflated 42%) 2022-11-23T02:48:26.3680219Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013258.xml (deflated 42%) 2022-11-23T02:48:26.3680974Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013302.xml (deflated 42%) 2022-11-23T02:48:26.3681752Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013311.xml (deflated 41%) 2022-11-23T02:48:26.3682538Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013319.xml (deflated 42%) 2022-11-23T02:48:26.3683318Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013323.xml (deflated 41%) 2022-11-23T02:48:26.3684129Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013332.xml (deflated 42%) 2022-11-23T02:48:26.3684890Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013336.xml (deflated 41%) 2022-11-23T02:48:26.3685660Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013346.xml (deflated 41%) 2022-11-23T02:48:26.3686434Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013354.xml (deflated 40%) 2022-11-23T02:48:26.3687313Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013402.xml (deflated 42%) 2022-11-23T02:48:26.3688071Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013406.xml (deflated 42%) 2022-11-23T02:48:26.3688903Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013410.xml (deflated 42%) 2022-11-23T02:48:26.3689693Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013414.xml (deflated 40%) 2022-11-23T02:48:26.3690470Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013423.xml (deflated 41%) 2022-11-23T02:48:26.3691224Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013431.xml (deflated 40%) 2022-11-23T02:48:26.3692007Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013438.xml (deflated 41%) 2022-11-23T02:48:26.3692781Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013444.xml (deflated 42%) 2022-11-23T02:48:26.3693559Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013449.xml (deflated 42%) 2022-11-23T02:48:26.3694317Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013453.xml (deflated 40%) 2022-11-23T02:48:26.3695089Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013500.xml (deflated 41%) 2022-11-23T02:48:26.3695864Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013502.xml (deflated 41%) 2022-11-23T02:48:26.3696637Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013504.xml (deflated 42%) 2022-11-23T02:48:26.3697400Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013507.xml (deflated 41%) 2022-11-23T02:48:26.3698185Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013509.xml (deflated 41%) 2022-11-23T02:48:26.3698960Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013512.xml (deflated 40%) 2022-11-23T02:48:26.3699735Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013514.xml (deflated 40%) 2022-11-23T02:48:26.3700507Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013517.xml (deflated 41%) 2022-11-23T02:48:26.3701265Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013523.xml (deflated 42%) 2022-11-23T02:48:26.3702048Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013526.xml (deflated 40%) 2022-11-23T02:48:26.3702821Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013533.xml (deflated 40%) 2022-11-23T02:48:26.3703594Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013540.xml (deflated 40%) 2022-11-23T02:48:26.3704356Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013547.xml (deflated 40%) 2022-11-23T02:48:26.3705130Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013555.xml (deflated 41%) 2022-11-23T02:48:26.3705909Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013604.xml (deflated 40%) 2022-11-23T02:48:26.3706749Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013612.xml (deflated 41%) 2022-11-23T02:48:26.3707549Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013620.xml (deflated 40%) 2022-11-23T02:48:26.3708340Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013629.xml (deflated 41%) 2022-11-23T02:48:26.3709107Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013653.xml (deflated 40%) 2022-11-23T02:48:26.3709873Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013716.xml (deflated 42%) 2022-11-23T02:48:26.3710634Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013719.xml (deflated 41%) 2022-11-23T02:48:26.3711414Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013721.xml (deflated 41%) 2022-11-23T02:48:26.3712215Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013723.xml (deflated 41%) 2022-11-23T02:48:26.3712985Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013726.xml (deflated 41%) 2022-11-23T02:48:26.3713762Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013728.xml (deflated 42%) 2022-11-23T02:48:26.3714526Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013731.xml (deflated 42%) 2022-11-23T02:48:26.3715817Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013733.xml (deflated 42%) 2022-11-23T02:48:26.3716614Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013735.xml (deflated 41%) 2022-11-23T02:48:26.3717388Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013738.xml (deflated 42%) 2022-11-23T02:48:26.3718157Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013740.xml (deflated 42%) 2022-11-23T02:48:26.3718937Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013743.xml (deflated 42%) 2022-11-23T02:48:26.3719711Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013745.xml (deflated 41%) 2022-11-23T02:48:26.3720485Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013747.xml (deflated 40%) 2022-11-23T02:48:26.3721245Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013754.xml (deflated 40%) 2022-11-23T02:48:26.3722024Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013801.xml (deflated 41%) 2022-11-23T02:48:26.3722798Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013803.xml (deflated 42%) 2022-11-23T02:48:26.3723574Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013806.xml (deflated 42%) 2022-11-23T02:48:26.3724339Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013810.xml (deflated 40%) 2022-11-23T02:48:26.3725110Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013819.xml (deflated 41%) 2022-11-23T02:48:26.3725883Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013827.xml (deflated 40%) 2022-11-23T02:48:26.3726761Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013835.xml (deflated 42%) 2022-11-23T02:48:26.3727583Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013840.xml (deflated 41%) 2022-11-23T02:48:26.3728374Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013844.xml (deflated 40%) 2022-11-23T02:48:26.3729149Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013851.xml (deflated 40%) 2022-11-23T02:48:26.3729923Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013857.xml (deflated 42%) 2022-11-23T02:48:26.3730686Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013902.xml (deflated 40%) 2022-11-23T02:48:26.3731457Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013910.xml (deflated 41%) 2022-11-23T02:48:26.3732242Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013918.xml (deflated 40%) 2022-11-23T02:48:26.3733018Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013927.xml (deflated 40%) 2022-11-23T02:48:26.3733781Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013935.xml (deflated 42%) 2022-11-23T02:48:26.3734554Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013942.xml (deflated 42%) 2022-11-23T02:48:26.3735328Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013949.xml (deflated 42%) 2022-11-23T02:48:26.3736115Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013956.xml (deflated 42%) 2022-11-23T02:48:26.3736876Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014002.xml (deflated 41%) 2022-11-23T02:48:26.3737656Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014011.xml (deflated 41%) 2022-11-23T02:48:26.3738431Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014019.xml (deflated 42%) 2022-11-23T02:48:26.3739202Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014022.xml (deflated 41%) 2022-11-23T02:48:26.3739966Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014030.xml (deflated 43%) 2022-11-23T02:48:26.3740745Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014032.xml (deflated 43%) 2022-11-23T02:48:26.3741518Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014035.xml (deflated 40%) 2022-11-23T02:48:26.3742294Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014043.xml (deflated 42%) 2022-11-23T02:48:26.3743054Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014046.xml (deflated 42%) 2022-11-23T02:48:26.3743835Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014048.xml (deflated 40%) 2022-11-23T02:48:26.3744609Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014057.xml (deflated 41%) 2022-11-23T02:48:26.3745387Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014059.xml (deflated 41%) 2022-11-23T02:48:26.3746250Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014101.xml (deflated 41%) 2022-11-23T02:48:26.3747062Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014104.xml (deflated 42%) 2022-11-23T02:48:26.3747855Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014106.xml (deflated 41%) 2022-11-23T02:48:26.3748629Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014109.xml (deflated 41%) 2022-11-23T02:48:26.3749411Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014111.xml (deflated 41%) 2022-11-23T02:48:26.3750169Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014113.xml (deflated 41%) 2022-11-23T02:48:26.3750956Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014116.xml (deflated 41%) 2022-11-23T02:48:26.3751738Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014124.xml (deflated 42%) 2022-11-23T02:48:26.3752507Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014126.xml (deflated 42%) 2022-11-23T02:48:26.3753267Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014129.xml (deflated 42%) 2022-11-23T02:48:26.3754046Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014131.xml (deflated 41%) 2022-11-23T02:48:26.3754820Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014140.xml (deflated 41%) 2022-11-23T02:48:26.3756159Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014142.xml (deflated 41%) 2022-11-23T02:48:26.3756921Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014144.xml (deflated 41%) 2022-11-23T02:48:26.3757697Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014147.xml (deflated 41%) 2022-11-23T02:48:26.3758477Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014155.xml (deflated 41%) 2022-11-23T02:48:26.3759254Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014203.xml (deflated 40%) 2022-11-23T02:48:26.3760015Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014212.xml (deflated 40%) 2022-11-23T02:48:26.3760802Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014221.xml (deflated 42%) 2022-11-23T02:48:26.3761580Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014223.xml (deflated 42%) 2022-11-23T02:48:26.3762363Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014225.xml (deflated 42%) 2022-11-23T02:48:26.3763121Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014234.xml (deflated 40%) 2022-11-23T02:48:26.3763897Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014242.xml (deflated 42%) 2022-11-23T02:48:26.3764676Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014244.xml (deflated 41%) 2022-11-23T02:48:26.3765555Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014253.xml (deflated 40%) 2022-11-23T02:48:26.3766316Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014306.xml (deflated 40%) 2022-11-23T02:48:26.3767160Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014322.xml (deflated 41%) 2022-11-23T02:48:26.3767956Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014330.xml (deflated 42%) 2022-11-23T02:48:26.3768729Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014333.xml (deflated 42%) 2022-11-23T02:48:26.3769500Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014339.xml (deflated 43%) 2022-11-23T02:48:26.3770260Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014343.xml (deflated 41%) 2022-11-23T02:48:26.3771042Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014352.xml (deflated 41%) 2022-11-23T02:48:26.3771818Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014400.xml (deflated 40%) 2022-11-23T02:48:26.3772594Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014408.xml (deflated 40%) 2022-11-23T02:48:26.3773395Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014417.xml (deflated 40%) 2022-11-23T02:48:26.3774176Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014425.xml (deflated 39%) 2022-11-23T02:48:26.3774946Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014433.xml (deflated 39%) 2022-11-23T02:48:26.3775725Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014442.xml (deflated 40%) 2022-11-23T02:48:26.3776486Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014450.xml (deflated 40%) 2022-11-23T02:48:26.3777258Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014458.xml (deflated 42%) 2022-11-23T02:48:26.3778035Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014503.xml (deflated 41%) 2022-11-23T02:48:26.3778809Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014509.xml (deflated 42%) 2022-11-23T02:48:26.3779573Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014514.xml (deflated 42%) 2022-11-23T02:48:26.3780360Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014522.xml (deflated 41%) 2022-11-23T02:48:26.3781137Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014524.xml (deflated 45%) 2022-11-23T02:48:26.3781912Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014526.xml (deflated 46%) 2022-11-23T02:48:26.3782673Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014529.xml (deflated 48%) 2022-11-23T02:48:26.3783445Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014531.xml (deflated 45%) 2022-11-23T02:48:26.3784219Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014534.xml (deflated 40%) 2022-11-23T02:48:26.3785063Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014542.xml (deflated 43%) 2022-11-23T02:48:26.3785829Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014544.xml (deflated 43%) 2022-11-23T02:48:26.3786670Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014547.xml (deflated 43%) 2022-11-23T02:48:26.3787492Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014549.xml (deflated 43%) 2022-11-23T02:48:26.3788262Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014552.xml (deflated 43%) 2022-11-23T02:48:26.3789038Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014554.xml (deflated 41%) 2022-11-23T02:48:26.3789804Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014602.xml (deflated 42%) 2022-11-23T02:48:26.3790577Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014605.xml (deflated 41%) 2022-11-23T02:48:26.3791351Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014607.xml (deflated 40%) 2022-11-23T02:48:26.3792134Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014616.xml (deflated 41%) 2022-11-23T02:48:26.3792888Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014622.xml (deflated 42%) 2022-11-23T02:48:26.3793668Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014625.xml (deflated 42%) 2022-11-23T02:48:26.3794441Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014627.xml (deflated 42%) 2022-11-23T02:48:26.3795716Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014630.xml (deflated 42%) 2022-11-23T02:48:26.3796508Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014632.xml (deflated 40%) 2022-11-23T02:48:26.3797295Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014641.xml (deflated 40%) 2022-11-23T02:48:26.3798074Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014650.xml (deflated 42%) 2022-11-23T02:48:26.3798853Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014652.xml (deflated 41%) 2022-11-23T02:48:26.3799608Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014654.xml (deflated 41%) 2022-11-23T02:48:26.3800393Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014657.xml (deflated 41%) 2022-11-23T02:48:26.3801170Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014659.xml (deflated 41%) 2022-11-23T02:48:26.3801944Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014702.xml (deflated 41%) 2022-11-23T02:48:26.3802709Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014704.xml (deflated 41%) 2022-11-23T02:48:26.3803486Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014706.xml (deflated 41%) 2022-11-23T02:48:26.3804258Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014709.xml (deflated 41%) 2022-11-23T02:48:26.3805144Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014711.xml (deflated 41%) 2022-11-23T02:48:26.3805966Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014713.xml (deflated 40%) 2022-11-23T02:48:26.3806751Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014720.xml (deflated 41%) 2022-11-23T02:48:26.3807528Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014723.xml (deflated 41%) 2022-11-23T02:48:26.3808299Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014725.xml (deflated 41%) 2022-11-23T02:48:26.3809057Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014728.xml (deflated 40%) 2022-11-23T02:48:26.3809840Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014734.xml (deflated 40%) 2022-11-23T02:48:26.3810620Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014743.xml (deflated 40%) 2022-11-23T02:48:26.3811398Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014751.xml (deflated 40%) 2022-11-23T02:48:26.3812156Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014800.xml (deflated 40%) 2022-11-23T02:48:26.3812936Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014808.xml (deflated 42%) 2022-11-23T02:48:26.3813711Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014815.xml (deflated 42%) 2022-11-23T02:48:26.3814479Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014822.xml (deflated 42%) 2022-11-23T02:48:26.3815244Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014829.xml (deflated 42%) 2022-11-23T02:48:26.3816031Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014836.xml (deflated 41%) 2022-11-23T02:48:26.3816803Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014844.xml (deflated 41%) 2022-11-23T02:48:26.3817582Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014852.xml (deflated 40%) 2022-11-23T02:48:26.3818339Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014901.xml (deflated 40%) 2022-11-23T02:48:26.3819115Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014909.xml (deflated 40%) 2022-11-23T02:48:26.3819900Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014918.xml (deflated 40%) 2022-11-23T02:48:26.3820673Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014926.xml (deflated 41%) 2022-11-23T02:48:26.3821427Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014935.xml (deflated 40%) 2022-11-23T02:48:26.3822205Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014943.xml (deflated 40%) 2022-11-23T02:48:26.3822980Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014951.xml (deflated 41%) 2022-11-23T02:48:26.3823755Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014954.xml (deflated 41%) 2022-11-23T02:48:26.3824592Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014956.xml (deflated 40%) 2022-11-23T02:48:26.3825397Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014959.xml (deflated 42%) 2022-11-23T02:48:26.3826189Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015001.xml (deflated 42%) 2022-11-23T02:48:26.3826960Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015003.xml (deflated 42%) 2022-11-23T02:48:26.3827735Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015006.xml (deflated 41%) 2022-11-23T02:48:26.3828494Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015008.xml (deflated 42%) 2022-11-23T02:48:26.3829272Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015011.xml (deflated 42%) 2022-11-23T02:48:26.3830055Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015013.xml (deflated 42%) 2022-11-23T02:48:26.3830838Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015015.xml (deflated 42%) 2022-11-23T02:48:26.3831594Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015018.xml (deflated 42%) 2022-11-23T02:48:26.3832368Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015020.xml (deflated 42%) 2022-11-23T02:48:26.3833148Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015023.xml (deflated 43%) 2022-11-23T02:48:26.3833929Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015025.xml (deflated 43%) 2022-11-23T02:48:26.3834687Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015027.xml (deflated 42%) 2022-11-23T02:48:26.3835993Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015030.xml (deflated 42%) 2022-11-23T02:48:26.3836774Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015032.xml (deflated 42%) 2022-11-23T02:48:26.3837549Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015035.xml (deflated 43%) 2022-11-23T02:48:26.3838367Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015037.xml (deflated 42%) 2022-11-23T02:48:26.3839123Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015039.xml (deflated 42%) 2022-11-23T02:48:26.3839907Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015042.xml (deflated 42%) 2022-11-23T02:48:26.3840687Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015044.xml (deflated 42%) 2022-11-23T02:48:26.3841463Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015046.xml (deflated 42%) 2022-11-23T02:48:26.3842217Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015049.xml (deflated 42%) 2022-11-23T02:48:26.3842991Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015051.xml (deflated 42%) 2022-11-23T02:48:26.3843811Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015054.xml (deflated 42%) 2022-11-23T02:48:26.3844684Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015056.xml (deflated 41%) 2022-11-23T02:48:26.3845507Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015105.xml (deflated 42%) 2022-11-23T02:48:26.3846298Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015112.xml (deflated 42%) 2022-11-23T02:48:26.3847075Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015114.xml (deflated 42%) 2022-11-23T02:48:26.3847849Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015116.xml (deflated 41%) 2022-11-23T02:48:26.3848607Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015125.xml (deflated 42%) 2022-11-23T02:48:26.3849390Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015128.xml (deflated 42%) 2022-11-23T02:48:26.3850169Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015135.xml (deflated 42%) 2022-11-23T02:48:26.3850940Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015137.xml (deflated 42%) 2022-11-23T02:48:26.3851696Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015144.xml (deflated 42%) 2022-11-23T02:48:26.3852478Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015146.xml (deflated 42%) 2022-11-23T02:48:26.3853254Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015149.xml (deflated 42%) 2022-11-23T02:48:26.3854042Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015151.xml (deflated 41%) 2022-11-23T02:48:26.3854793Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015153.xml (deflated 41%) 2022-11-23T02:48:26.3855571Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015156.xml (deflated 41%) 2022-11-23T02:48:26.3856353Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015158.xml (deflated 41%) 2022-11-23T02:48:26.3857124Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015201.xml (deflated 41%) 2022-11-23T02:48:26.3857898Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015203.xml (deflated 41%) 2022-11-23T02:48:26.3858667Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015205.xml (deflated 41%) 2022-11-23T02:48:26.3859441Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015208.xml (deflated 41%) 2022-11-23T02:48:26.3860222Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015210.xml (deflated 41%) 2022-11-23T02:48:26.3860997Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015213.xml (deflated 41%) 2022-11-23T02:48:26.3861760Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015215.xml (deflated 41%) 2022-11-23T02:48:26.3862535Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015223.xml (deflated 41%) 2022-11-23T02:48:26.3863313Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015226.xml (deflated 40%) 2022-11-23T02:48:26.3864156Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015234.xml (deflated 42%) 2022-11-23T02:48:26.3864985Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015241.xml (deflated 40%) 2022-11-23T02:48:26.3865779Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015249.xml (deflated 42%) 2022-11-23T02:48:26.3866543Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015253.xml (deflated 42%) 2022-11-23T02:48:26.3867322Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015257.xml (deflated 42%) 2022-11-23T02:48:26.3868079Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015302.xml (deflated 40%) 2022-11-23T02:48:26.3868856Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015310.xml (deflated 40%) 2022-11-23T02:48:26.3869628Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015319.xml (deflated 42%) 2022-11-23T02:48:26.3870400Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015323.xml (deflated 42%) 2022-11-23T02:48:26.3871161Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015327.xml (deflated 40%) 2022-11-23T02:48:26.3871944Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015335.xml (deflated 40%) 2022-11-23T02:48:26.3872719Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015343.xml (deflated 40%) 2022-11-23T02:48:26.3873538Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015352.xml (deflated 40%) 2022-11-23T02:48:26.3874303Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015400.xml (deflated 42%) 2022-11-23T02:48:26.3875579Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015405.xml (deflated 40%) 2022-11-23T02:48:26.3876386Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015413.xml (deflated 42%) 2022-11-23T02:48:26.3877159Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015417.xml (deflated 41%) 2022-11-23T02:48:26.3877924Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015425.xml (deflated 42%) 2022-11-23T02:48:26.3878712Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015429.xml (deflated 42%) 2022-11-23T02:48:26.3879491Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015434.xml (deflated 40%) 2022-11-23T02:48:26.3880272Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015442.xml (deflated 40%) 2022-11-23T02:48:26.3881033Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015452.xml (deflated 42%) 2022-11-23T02:48:26.3881801Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015456.xml (deflated 40%) 2022-11-23T02:48:26.3882580Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015504.xml (deflated 42%) 2022-11-23T02:48:26.3883457Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015507.xml (deflated 42%) 2022-11-23T02:48:26.3884219Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015509.xml (deflated 42%) 2022-11-23T02:48:26.3885062Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015512.xml (deflated 41%) 2022-11-23T02:48:26.3885854Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015514.xml (deflated 41%) 2022-11-23T02:48:26.3886630Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015516.xml (deflated 41%) 2022-11-23T02:48:26.3887388Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015519.xml (deflated 41%) 2022-11-23T02:48:26.3888161Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015521.xml (deflated 41%) 2022-11-23T02:48:26.3888947Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015523.xml (deflated 41%) 2022-11-23T02:48:26.3889722Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015526.xml (deflated 41%) 2022-11-23T02:48:26.3890483Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015528.xml (deflated 42%) 2022-11-23T02:48:26.3891263Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015531.xml (deflated 42%) 2022-11-23T02:48:26.3892043Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015533.xml (deflated 42%) 2022-11-23T02:48:26.3892819Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015537.xml (deflated 40%) 2022-11-23T02:48:26.3893599Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015546.xml (deflated 40%) 2022-11-23T02:48:26.3894365Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015554.xml (deflated 40%) 2022-11-23T02:48:26.3895145Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015602.xml (deflated 40%) 2022-11-23T02:48:26.3895919Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015611.xml (deflated 40%) 2022-11-23T02:48:26.3896697Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015619.xml (deflated 40%) 2022-11-23T02:48:26.3897453Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015637.xml (deflated 41%) 2022-11-23T02:48:26.3898282Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015646.xml (deflated 41%) 2022-11-23T02:48:26.3899051Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015654.xml (deflated 41%) 2022-11-23T02:48:26.3899834Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015703.xml (deflated 40%) 2022-11-23T02:48:26.3900607Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015712.xml (deflated 42%) 2022-11-23T02:48:26.3901369Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015716.xml (deflated 42%) 2022-11-23T02:48:26.3902144Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015720.xml (deflated 42%) 2022-11-23T02:48:26.3902992Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015729.xml (deflated 40%) 2022-11-23T02:48:26.3903761Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015737.xml (deflated 42%) 2022-11-23T02:48:26.3904577Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015741.xml (deflated 40%) 2022-11-23T02:48:26.3905368Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015750.xml (deflated 42%) 2022-11-23T02:48:26.3906138Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015754.xml (deflated 41%) 2022-11-23T02:48:26.3906914Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015803.xml (deflated 41%) 2022-11-23T02:48:26.3907684Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015812.xml (deflated 40%) 2022-11-23T02:48:26.3908460Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015820.xml (deflated 42%) 2022-11-23T02:48:26.3909232Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015824.xml (deflated 42%) 2022-11-23T02:48:26.3910007Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015828.xml (deflated 42%) 2022-11-23T02:48:26.3910768Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015832.xml (deflated 41%) 2022-11-23T02:48:26.3911545Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015841.xml (deflated 41%) 2022-11-23T02:48:26.3912316Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015849.xml (deflated 40%) 2022-11-23T02:48:26.3913099Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015856.xml (deflated 40%) 2022-11-23T02:48:26.3913862Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015902.xml (deflated 42%) 2022-11-23T02:48:26.3914643Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015907.xml (deflated 42%) 2022-11-23T02:48:26.3915954Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015911.xml (deflated 40%) 2022-11-23T02:48:26.3916735Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015918.xml (deflated 41%) 2022-11-23T02:48:26.3917490Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015920.xml (deflated 41%) 2022-11-23T02:48:26.3918272Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015922.xml (deflated 41%) 2022-11-23T02:48:26.3919049Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015925.xml (deflated 41%) 2022-11-23T02:48:26.3919817Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015927.xml (deflated 41%) 2022-11-23T02:48:26.3920573Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015930.xml (deflated 40%) 2022-11-23T02:48:26.3921351Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015932.xml (deflated 40%) 2022-11-23T02:48:26.3922126Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015934.xml (deflated 41%) 2022-11-23T02:48:26.3923002Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015941.xml (deflated 42%) 2022-11-23T02:48:26.3923830Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015944.xml (deflated 40%) 2022-11-23T02:48:26.3924619Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015951.xml (deflated 40%) 2022-11-23T02:48:26.3925391Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015958.xml (deflated 40%) 2022-11-23T02:48:26.3926166Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020005.xml (deflated 40%) 2022-11-23T02:48:26.3926926Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020013.xml (deflated 41%) 2022-11-23T02:48:26.3927707Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020021.xml (deflated 41%) 2022-11-23T02:48:26.3928486Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020029.xml (deflated 40%) 2022-11-23T02:48:26.3929266Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020038.xml (deflated 40%) 2022-11-23T02:48:26.3930039Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020046.xml (deflated 41%) 2022-11-23T02:48:26.3930799Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020111.xml (deflated 41%) 2022-11-23T02:48:26.3931569Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020135.xml (deflated 42%) 2022-11-23T02:48:26.3932343Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020137.xml (deflated 42%) 2022-11-23T02:48:26.3933119Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020140.xml (deflated 41%) 2022-11-23T02:48:26.3933882Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020142.xml (deflated 41%) 2022-11-23T02:48:26.3934653Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020144.xml (deflated 41%) 2022-11-23T02:48:26.3935423Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020147.xml (deflated 42%) 2022-11-23T02:48:26.3936204Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020149.xml (deflated 42%) 2022-11-23T02:48:26.3936967Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020152.xml (deflated 42%) 2022-11-23T02:48:26.3937747Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020154.xml (deflated 42%) 2022-11-23T02:48:26.3938522Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020156.xml (deflated 42%) 2022-11-23T02:48:26.3939295Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020159.xml (deflated 42%) 2022-11-23T02:48:26.3940054Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020201.xml (deflated 42%) 2022-11-23T02:48:26.3940828Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020204.xml (deflated 42%) 2022-11-23T02:48:26.3966650Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020206.xml (deflated 40%) 2022-11-23T02:48:26.3967727Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020213.xml (deflated 41%) 2022-11-23T02:48:26.3968606Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020220.xml (deflated 42%) 2022-11-23T02:48:26.3969405Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020222.xml (deflated 42%) 2022-11-23T02:48:26.3970204Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020225.xml (deflated 42%) 2022-11-23T02:48:26.3970992Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020229.xml (deflated 41%) 2022-11-23T02:48:26.3971777Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020237.xml (deflated 41%) 2022-11-23T02:48:26.3972545Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020246.xml (deflated 41%) 2022-11-23T02:48:26.3973349Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020255.xml (deflated 42%) 2022-11-23T02:48:26.3974187Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020259.xml (deflated 41%) 2022-11-23T02:48:26.3974975Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020303.xml (deflated 40%) 2022-11-23T02:48:26.3975745Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020310.xml (deflated 40%) 2022-11-23T02:48:26.3976531Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020316.xml (deflated 42%) 2022-11-23T02:48:26.3977323Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020321.xml (deflated 40%) 2022-11-23T02:48:26.3978111Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020329.xml (deflated 41%) 2022-11-23T02:48:26.3978899Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020338.xml (deflated 40%) 2022-11-23T02:48:26.3979671Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020346.xml (deflated 40%) 2022-11-23T02:48:26.3980449Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020355.xml (deflated 42%) 2022-11-23T02:48:26.3981235Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020401.xml (deflated 42%) 2022-11-23T02:48:26.3982024Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020408.xml (deflated 42%) 2022-11-23T02:48:26.3982794Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020415.xml (deflated 42%) 2022-11-23T02:48:26.3983579Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020422.xml (deflated 41%) 2022-11-23T02:48:26.3984360Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020430.xml (deflated 41%) 2022-11-23T02:48:26.3985144Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020439.xml (deflated 42%) 2022-11-23T02:48:26.3985910Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020441.xml (deflated 41%) 2022-11-23T02:48:26.3986689Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020449.xml (deflated 43%) 2022-11-23T02:48:26.3987541Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020452.xml (deflated 43%) 2022-11-23T02:48:26.3988368Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020454.xml (deflated 40%) 2022-11-23T02:48:26.3989149Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020503.xml (deflated 42%) 2022-11-23T02:48:26.3989929Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020505.xml (deflated 42%) 2022-11-23T02:48:26.3990709Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020507.xml (deflated 41%) 2022-11-23T02:48:26.3991488Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020516.xml (deflated 41%) 2022-11-23T02:48:26.3992263Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020518.xml (deflated 41%) 2022-11-23T02:48:26.3993051Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020521.xml (deflated 41%) 2022-11-23T02:48:26.3993841Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020523.xml (deflated 42%) 2022-11-23T02:48:26.3994625Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020526.xml (deflated 41%) 2022-11-23T02:48:26.3995929Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020528.xml (deflated 41%) 2022-11-23T02:48:26.3996764Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020530.xml (deflated 41%) 2022-11-23T02:48:26.3997561Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020533.xml (deflated 41%) 2022-11-23T02:48:26.3998351Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020535.xml (deflated 41%) 2022-11-23T02:48:26.3999138Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020543.xml (deflated 42%) 2022-11-23T02:48:26.3999903Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020546.xml (deflated 42%) 2022-11-23T02:48:26.4000692Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020548.xml (deflated 42%) 2022-11-23T02:48:26.4001470Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020550.xml (deflated 41%) 2022-11-23T02:48:26.4002255Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020559.xml (deflated 41%) 2022-11-23T02:48:26.4003016Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020601.xml (deflated 41%) 2022-11-23T02:48:26.4003811Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020604.xml (deflated 41%) 2022-11-23T02:48:26.4004588Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020606.xml (deflated 40%) 2022-11-23T02:48:26.4005370Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020615.xml (deflated 41%) 2022-11-23T02:48:26.4006130Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020623.xml (deflated 40%) 2022-11-23T02:48:26.4006906Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020631.xml (deflated 40%) 2022-11-23T02:48:26.4007791Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020640.xml (deflated 42%) 2022-11-23T02:48:26.4008628Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020642.xml (deflated 42%) 2022-11-23T02:48:26.4009410Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020644.xml (deflated 41%) 2022-11-23T02:48:26.4010181Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020653.xml (deflated 40%) 2022-11-23T02:48:26.4010962Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020701.xml (deflated 41%) 2022-11-23T02:48:26.4011737Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020703.xml (deflated 41%) 2022-11-23T02:48:26.4012501Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020712.xml (deflated 41%) 2022-11-23T02:48:26.4013287Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020725.xml (deflated 40%) 2022-11-23T02:48:26.4014091Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_dedup_tensors/TEST-TestDedupTensor-20221123020807.xml (deflated 40%) 2022-11-23T02:48:26.4014880Z adding: test/test-reports/python-unittest/distributed._composable.test_checkpoint/TEST-TestCheckpoint-20221123020815.xml (deflated 55%) 2022-11-23T02:48:26.4015658Z adding: test/test-reports/python-unittest/distributed.test_launcher/TEST-TestDistributedLaunch-20221123020818.xml (deflated 43%) 2022-11-23T02:48:26.4016421Z adding: 
test/test-reports/python-unittest/distributed.elastic.metrics.api_test/TEST-MetricsApiTest-20221123020822.xml (deflated 63%) 2022-11-23T02:48:26.4017253Z adding: test/test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20221123020826.xml (deflated 52%) 2022-11-23T02:48:26.4018173Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype/TEST-TestShardedTensorMegatronLinear-20221123020832.xml (deflated 44%) 2022-11-23T02:48:26.4019121Z adding: test/test-reports/python-unittest/distributed._tensor.parallel.test_view_sharding_dim_change/TEST-TPViewShardingDimChangeTest-20221123020839.xml (deflated 43%) 2022-11-23T02:48:26.4019931Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20221123020846.xml (deflated 55%) 2022-11-23T02:48:26.4020732Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerServerTest-20221123020854.xml (deflated 66%) 2022-11-23T02:48:26.4021555Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerTest-20221123020854.xml (deflated 69%) 2022-11-23T02:48:26.4022431Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-MultiprocessingRequestQueueTest-20221123020854.xml (deflated 66%) 2022-11-23T02:48:26.4023335Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag/TEST-TestShardedEmbeddingBag-20221123020902.xml (deflated 60%) 2022-11-23T02:48:26.4024189Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax/TEST-TestShardedSoftmax-20221123020911.xml (deflated 59%) 2022-11-23T02:48:26.4024959Z adding: test/test-reports/python-unittest/distributed._tensor.test_view_ops/TEST-TestViewOps-20221123020920.xml (deflated 51%) 2022-11-23T02:48:26.4025673Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20221123020929.xml (deflated 57%) 2022-11-23T02:48:26.4026517Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20221123020940.xml (deflated 69%) 2022-11-23T02:48:26.4027452Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20221123020951.xml (deflated 73%) 2022-11-23T02:48:26.4028355Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeOne-20221123021004.xml (deflated 43%) 2022-11-23T02:48:26.4029228Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_overlap/TEST-TestForwardOverlapWorldSizeTwo-20221123021004.xml (deflated 43%) 2022-11-23T02:48:26.4030127Z adding: test/test-reports/python-unittest/distributed._tensor.parallel.test_tp_examples/TEST-DistTensorParallelExampleTest-20221123021020.xml (deflated 65%) 2022-11-23T02:48:26.4031042Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedReshardOnLoad-20221123021035.xml (deflated 68%) 2022-11-23T02:48:26.4032002Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoad-20221123021035.xml (deflated 43%) 2022-11-23T02:48:26.4033054Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20221123021035.xml (deflated 44%) 2022-11-23T02:48:26.4034000Z adding: test/test-reports/python-unittest/distributed._tensor.test_pointwise_ops/TEST-DistElementwiseOpsTest-20221123021051.xml (deflated 68%) 2022-11-23T02:48:26.4034771Z adding: test/test-reports/python-unittest/distributed.test_dynamo_distributed/TEST-TestDistributed-20221123021109.xml (deflated 87%) 2022-11-23T02:48:26.4036081Z 
adding: test/test-reports/python-unittest/distributed.test_dynamo_distributed/TEST-TestDistributedMultiProc-20221123021109.xml (deflated 73%) 2022-11-23T02:48:26.4036934Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20221123021126.xml (deflated 75%) 2022-11-23T02:48:26.4037777Z adding: test/test-reports/python-unittest/distributed._tensor.parallel.test_tp_style/TEST-TensorParallelStyleTest-20221123021149.xml (deflated 82%) 2022-11-23T02:48:26.4038688Z adding: test/test-reports/python-unittest/distributed.algorithms.ddp_comm_hooks.test_ddp_hooks/TEST-DistributedDataParallelCommHookTest-20221123021212.xml (deflated 79%) 2022-11-23T02:48:26.4039637Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20221123021241.xml (deflated 86%) 2022-11-23T02:48:26.4040461Z adding: test/test-reports/python-unittest/distributed._tensor.test_common_rules/TEST-CommonRulesTest-20221123021312.xml (deflated 84%) 2022-11-23T02:48:26.4041224Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20221123021343.xml (deflated 91%) 2022-11-23T02:48:26.4041952Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-CommTest-20221123021420.xml (deflated 38%) 2022-11-23T02:48:26.4042698Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021426.xml (deflated 41%) 2022-11-23T02:48:26.4043521Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021430.xml (deflated 40%) 2022-11-23T02:48:26.4044333Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021434.xml (deflated 40%) 2022-11-23T02:48:26.4045142Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ComputeBucketAssignmentTest-20221123021438.xml (deflated 41%) 2022-11-23T02:48:26.4045984Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021442.xml (deflated 42%) 2022-11-23T02:48:26.4046956Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021448.xml (deflated 41%) 2022-11-23T02:48:26.4047809Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021457.xml (deflated 41%) 2022-11-23T02:48:26.4048723Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-PythonProcessGroupExtensionTest-20221123021503.xml (deflated 41%) 2022-11-23T02:48:26.4049492Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123021511.xml (deflated 39%) 2022-11-23T02:48:26.4050198Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123021515.xml (deflated 39%) 2022-11-23T02:48:26.4050888Z adding: test/test-reports/python-unittest/distributed.test_c10d_common/TEST-ReduceOpTest-20221123021519.xml (deflated 39%) 2022-11-23T02:48:26.4051657Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20221123021523.xml (deflated 85%) 2022-11-23T02:48:26.4052465Z adding: test/test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshCollectiveTest-20221123021601.xml (deflated 88%) 2022-11-23T02:48:26.4053252Z adding: test/test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshTest-20221123021601.xml (deflated 73%) 2022-11-23T02:48:26.4054045Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021657.xml (deflated 40%) 2022-11-23T02:48:26.4054887Z adding: 
test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021703.xml (deflated 40%) 2022-11-23T02:48:26.4055692Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021710.xml (deflated 41%) 2022-11-23T02:48:26.4056518Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021716.xml (deflated 41%) 2022-11-23T02:48:26.4057338Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021723.xml (deflated 40%) 2022-11-23T02:48:26.4058157Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021729.xml (deflated 40%) 2022-11-23T02:48:26.4058962Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021736.xml (deflated 41%) 2022-11-23T02:48:26.4059778Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021742.xml (deflated 41%) 2022-11-23T02:48:26.4060585Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123021749.xml (deflated 40%) 2022-11-23T02:48:26.4061395Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021755.xml (deflated 39%) 2022-11-23T02:48:26.4062194Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021801.xml (deflated 40%) 2022-11-23T02:48:26.4062999Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021809.xml (deflated 39%) 2022-11-23T02:48:26.4063825Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021816.xml (deflated 40%) 2022-11-23T02:48:26.4064634Z adding: 
test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123021824.xml (deflated 39%) 2022-11-23T02:48:26.4065420Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20221123021832.xml (deflated 91%) 2022-11-23T02:48:26.4066228Z adding: test/test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkSubclass-20221123022015.xml (deflated 84%) 2022-11-23T02:48:26.4067078Z adding: test/test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkWrapper-20221123022015.xml (deflated 84%) 2022-11-23T02:48:26.4067988Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParams-20221123022221.xml (deflated 91%) 2022-11-23T02:48:26.4068869Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParamsNoShard-20221123022221.xml (deflated 43%) 2022-11-23T02:48:26.4069646Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022527.xml (deflated 38%) 2022-11-23T02:48:26.4070310Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022533.xml (deflated 38%) 2022-11-23T02:48:26.4070988Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022540.xml (deflated 38%) 2022-11-23T02:48:26.4071671Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022547.xml (deflated 38%) 2022-11-23T02:48:26.4072343Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022553.xml (deflated 38%) 2022-11-23T02:48:26.4073013Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022600.xml (deflated 39%) 2022-11-23T02:48:26.4073722Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022607.xml (deflated 
39%) 2022-11-23T02:48:26.4074389Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022614.xml (deflated 38%) 2022-11-23T02:48:26.4075455Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022620.xml (deflated 38%) 2022-11-23T02:48:26.4076241Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022626.xml (deflated 37%) 2022-11-23T02:48:26.4076926Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123022633.xml (deflated 38%) 2022-11-23T02:48:26.4077618Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022639.xml (deflated 38%) 2022-11-23T02:48:26.4078308Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022646.xml (deflated 37%) 2022-11-23T02:48:26.4079009Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022653.xml (deflated 38%) 2022-11-23T02:48:26.4079707Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022659.xml (deflated 38%) 2022-11-23T02:48:26.4080400Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022706.xml (deflated 38%) 2022-11-23T02:48:26.4081071Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022713.xml (deflated 38%) 2022-11-23T02:48:26.4081761Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022720.xml (deflated 39%) 2022-11-23T02:48:26.4082454Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022726.xml (deflated 38%) 2022-11-23T02:48:26.4083144Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022733.xml (deflated 38%) 2022-11-23T02:48:26.4083815Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022740.xml (deflated 38%) 2022-11-23T02:48:26.4084496Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123022746.xml (deflated 38%) 2022-11-23T02:48:26.4085263Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022753.xml (deflated 45%) 2022-11-23T02:48:26.4086448Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022801.xml (deflated 45%) 2022-11-23T02:48:26.4087245Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022809.xml (deflated 43%) 2022-11-23T02:48:26.4088147Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022816.xml (deflated 43%) 2022-11-23T02:48:26.4088959Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022824.xml (deflated 45%) 2022-11-23T02:48:26.4089757Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022832.xml (deflated 45%) 2022-11-23T02:48:26.4090551Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022840.xml (deflated 47%) 2022-11-23T02:48:26.4091364Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022848.xml (deflated 47%) 2022-11-23T02:48:26.4092159Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022855.xml (deflated 44%) 2022-11-23T02:48:26.4092963Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022903.xml (deflated 46%) 2022-11-23T02:48:26.4093746Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022911.xml (deflated 46%) 2022-11-23T02:48:26.4094550Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022919.xml (deflated 44%) 2022-11-23T02:48:26.4095344Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022926.xml (deflated 43%) 2022-11-23T02:48:26.4096143Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022934.xml (deflated 43%) 2022-11-23T02:48:26.4096952Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022940.xml (deflated 44%) 2022-11-23T02:48:26.4097738Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022948.xml (deflated 45%) 2022-11-23T02:48:26.4098527Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123022954.xml (deflated 44%) 2022-11-23T02:48:26.4099327Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023000.xml (deflated 46%) 2022-11-23T02:48:26.4100126Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023007.xml (deflated 45%) 2022-11-23T02:48:26.4100909Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023013.xml (deflated 50%) 2022-11-23T02:48:26.4101709Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023020.xml (deflated 42%) 2022-11-23T02:48:26.4102506Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023028.xml (deflated 42%) 2022-11-23T02:48:26.4103307Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023035.xml (deflated 41%) 2022-11-23T02:48:26.4104085Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023042.xml (deflated 41%) 2022-11-23T02:48:26.4104874Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023050.xml (deflated 42%) 2022-11-23T02:48:26.4105672Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023057.xml (deflated 42%) 2022-11-23T02:48:26.4106523Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023104.xml (deflated 42%) 2022-11-23T02:48:26.4107343Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023110.xml (deflated 42%) 2022-11-23T02:48:26.4108144Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023117.xml (deflated 41%) 2022-11-23T02:48:26.4108939Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023123.xml (deflated 44%) 2022-11-23T02:48:26.4109738Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023129.xml (deflated 45%) 2022-11-23T02:48:26.4110510Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023136.xml (deflated 41%) 2022-11-23T02:48:26.4111306Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023143.xml (deflated 41%) 2022-11-23T02:48:26.4112105Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023149.xml (deflated 41%) 2022-11-23T02:48:26.4112902Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023157.xml (deflated 42%) 2022-11-23T02:48:26.4113678Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023203.xml (deflated 42%) 2022-11-23T02:48:26.4114461Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023210.xml (deflated 41%) 2022-11-23T02:48:26.4115782Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023218.xml (deflated 41%) 2022-11-23T02:48:26.4116716Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023226.xml (deflated 42%) 2022-11-23T02:48:26.4117686Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023232.xml (deflated 43%) 2022-11-23T02:48:26.4118642Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023238.xml (deflated 44%) 2022-11-23T02:48:26.4119607Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123023244.xml (deflated 42%) 2022-11-23T02:48:26.4120458Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023250.xml (deflated 39%) 2022-11-23T02:48:26.4121205Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023257.xml (deflated 39%) 2022-11-23T02:48:26.4121937Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023831.xml (deflated 39%) 2022-11-23T02:48:26.4122671Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023305.xml (deflated 39%) 
2022-11-23T02:48:26.4123426Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023311.xml (deflated 40%) 2022-11-23T02:48:26.4124165Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023318.xml (deflated 40%) 2022-11-23T02:48:26.4124916Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023324.xml (deflated 39%) 2022-11-23T02:48:26.4125669Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023331.xml (deflated 40%) 2022-11-23T02:48:26.4126523Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023338.xml (deflated 39%) 2022-11-23T02:48:26.4127259Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023347.xml (deflated 40%) 2022-11-23T02:48:26.4128065Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023354.xml (deflated 40%) 2022-11-23T02:48:26.4128830Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023401.xml (deflated 39%) 2022-11-23T02:48:26.4129586Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023409.xml (deflated 39%) 2022-11-23T02:48:26.4130319Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023415.xml (deflated 40%) 2022-11-23T02:48:26.4131067Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023422.xml (deflated 40%) 2022-11-23T02:48:26.4131824Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023428.xml (deflated 40%) 2022-11-23T02:48:26.4132581Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023435.xml (deflated 40%) 2022-11-23T02:48:26.4133312Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023441.xml (deflated 39%) 2022-11-23T02:48:26.4134038Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023449.xml (deflated 40%) 2022-11-23T02:48:26.4134758Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023456.xml (deflated 40%) 2022-11-23T02:48:26.4135484Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023502.xml (deflated 40%) 2022-11-23T02:48:26.4136208Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023510.xml (deflated 40%) 2022-11-23T02:48:26.4136930Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023517.xml (deflated 40%) 2022-11-23T02:48:26.4137672Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023523.xml (deflated 40%) 2022-11-23T02:48:26.4138405Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023531.xml (deflated 39%) 2022-11-23T02:48:26.4139129Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023537.xml (deflated 40%) 2022-11-23T02:48:26.4139844Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023544.xml (deflated 40%) 2022-11-23T02:48:26.4140572Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023552.xml (deflated 40%) 2022-11-23T02:48:26.4141310Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023558.xml (deflated 39%) 2022-11-23T02:48:26.4142036Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023605.xml (deflated 39%) 2022-11-23T02:48:26.4142751Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023612.xml (deflated 39%) 2022-11-23T02:48:26.4143478Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023619.xml (deflated 39%) 2022-11-23T02:48:26.4144263Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023625.xml (deflated 39%) 2022-11-23T02:48:26.4145005Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023633.xml (deflated 39%) 2022-11-23T02:48:26.4145803Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023642.xml (deflated 39%) 2022-11-23T02:48:26.4146595Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023649.xml (deflated 39%) 2022-11-23T02:48:26.4147352Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023655.xml (deflated 40%) 2022-11-23T02:48:26.4148094Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023703.xml (deflated 40%) 2022-11-23T02:48:26.4148826Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023709.xml (deflated 39%) 2022-11-23T02:48:26.4149571Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023716.xml (deflated 40%) 2022-11-23T02:48:26.4150327Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023725.xml (deflated 39%) 2022-11-23T02:48:26.4151080Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023731.xml (deflated 40%) 2022-11-23T02:48:26.4151819Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023738.xml (deflated 40%) 2022-11-23T02:48:26.4152570Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023744.xml (deflated 39%) 2022-11-23T02:48:26.4153318Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023752.xml (deflated 39%) 2022-11-23T02:48:26.4154064Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023759.xml (deflated 39%) 2022-11-23T02:48:26.4154815Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023806.xml (deflated 40%) 2022-11-23T02:48:26.4156087Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023808.xml (deflated 40%) 2022-11-23T02:48:26.4156850Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023815.xml (deflated 41%) 2022-11-23T02:48:26.4157598Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023817.xml (deflated 40%) 2022-11-23T02:48:26.4158326Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123023824.xml (deflated 40%) 2022-11-23T02:48:26.4159050Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023833.xml (deflated 39%) 2022-11-23T02:48:26.4159740Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023835.xml (deflated 39%) 
2022-11-23T02:48:26.4160432Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023837.xml (deflated 39%) 2022-11-23T02:48:26.4161098Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023839.xml (deflated 38%) 2022-11-23T02:48:26.4161772Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123023842.xml (deflated 39%) 2022-11-23T02:48:26.4162480Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20221123023844.xml (deflated 39%) 2022-11-23T02:48:26.4163176Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20221123023848.xml (deflated 41%) 2022-11-23T02:48:26.4163849Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20221123023852.xml (deflated 79%) 2022-11-23T02:48:26.4164564Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20221123023852.xml (deflated 64%) 2022-11-23T02:48:26.4165385Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20221123023852.xml (deflated 61%) 2022-11-23T02:48:26.4166188Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20221123023852.xml (deflated 91%) 2022-11-23T02:48:26.4186374Z ##[group]Run # Remove any previous test reports if they exist 2022-11-23T02:48:26.4186754Z # Remove any previous test reports if they exist 2022-11-23T02:48:26.4187081Z rm -f usage-log-*.zip 2022-11-23T02:48:26.4187457Z # this workflow is also run in bazel build test, but we dont generate usage reports for it 2022-11-23T02:48:26.4187860Z # so check to see if the file exists first 2022-11-23T02:48:26.4188159Z if [ -f 'usage_log.txt' ]; then 2022-11-23T02:48:26.4188503Z  zip "usage-log-${FILE_SUFFIX}.zip" 'usage_log.txt' 2022-11-23T02:48:26.4188813Z fi 2022-11-23T02:48:26.4200932Z shell: 
/usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:48:26.4201236Z env: 2022-11-23T02:48:26.4201486Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:26.4201742Z GPU_FLAG: --gpus all 2022-11-23T02:48:26.4202132Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:26.4202615Z FILE_SUFFIX: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221 2022-11-23T02:48:26.4202960Z ##[endgroup] 2022-11-23T02:48:26.4981427Z adding: usage_log.txt (deflated 95%) 2022-11-23T02:48:26.5028071Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T02:48:26.5028360Z with: 2022-11-23T02:48:26.5028624Z s3-prefix: pytorch/pytorch/3528293554/1/artifact 2022-11-23T02:48:26.5028918Z retention-days: 14 2022-11-23T02:48:26.5029194Z if-no-files-found: warn 2022-11-23T02:48:26.5029454Z path: test-jsons-*.zip 2022-11-23T02:48:26.5029708Z name: artifact 2022-11-23T02:48:26.5029964Z s3-bucket: gha-artifacts 2022-11-23T02:48:26.5030224Z region: us-east-1 2022-11-23T02:48:26.5030449Z env: 2022-11-23T02:48:26.5030688Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:26.5030939Z GPU_FLAG: --gpus all 2022-11-23T02:48:26.5031318Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:26.5031687Z ##[endgroup] 2022-11-23T02:48:26.9469586Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T02:48:26.9470260Z With the provided path, there will be 1 file uploaded 2022-11-23T02:48:26.9470635Z Uploading to s3 prefix: pytorch/pytorch/3528293554/1/artifact 2022-11-23T02:48:26.9482082Z Starting upload of test-jsons-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221.zip 2022-11-23T02:48:27.1273690Z Finished upload of test-jsons-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221.zip 2022-11-23T02:48:27.1440648Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T02:48:27.1440951Z with: 2022-11-23T02:48:27.1441218Z s3-prefix: pytorch/pytorch/3528293554/1/artifact 
2022-11-23T02:48:27.1441537Z retention-days: 14 2022-11-23T02:48:27.1441813Z if-no-files-found: error 2022-11-23T02:48:27.1442080Z path: test-reports-*.zip 2022-11-23T02:48:27.1442337Z name: artifact 2022-11-23T02:48:27.1442593Z s3-bucket: gha-artifacts 2022-11-23T02:48:27.1442841Z region: us-east-1 2022-11-23T02:48:27.1443089Z env: 2022-11-23T02:48:27.1443330Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:27.1443581Z GPU_FLAG: --gpus all 2022-11-23T02:48:27.1443962Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:27.1444327Z ##[endgroup] 2022-11-23T02:48:27.5940253Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T02:48:27.5941491Z With the provided path, there will be 1 file uploaded 2022-11-23T02:48:27.5941856Z Uploading to s3 prefix: pytorch/pytorch/3528293554/1/artifact 2022-11-23T02:48:27.5953678Z Starting upload of test-reports-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221.zip 2022-11-23T02:48:27.7757959Z Finished upload of test-reports-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221.zip 2022-11-23T02:48:27.7934649Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T02:48:27.7934954Z with: 2022-11-23T02:48:27.7935247Z s3-prefix: pytorch/pytorch/3528293554/1/artifact 2022-11-23T02:48:27.7935533Z retention-days: 14 2022-11-23T02:48:27.7935930Z if-no-files-found: ignore 2022-11-23T02:48:27.7936229Z path: usage-log-*.zip 2022-11-23T02:48:27.7936467Z name: artifact 2022-11-23T02:48:27.7936728Z s3-bucket: gha-artifacts 2022-11-23T02:48:27.7936999Z region: us-east-1 2022-11-23T02:48:27.7937234Z env: 2022-11-23T02:48:27.7937461Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:27.7937734Z GPU_FLAG: --gpus all 2022-11-23T02:48:27.7938117Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:27.7938467Z ##[endgroup] 2022-11-23T02:48:28.2335651Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T02:48:28.2336579Z With the 
provided path, there will be 1 file uploaded 2022-11-23T02:48:28.2336986Z Uploading to s3 prefix: pytorch/pytorch/3528293554/1/artifact 2022-11-23T02:48:28.2349975Z Starting upload of usage-log-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221.zip 2022-11-23T02:48:28.3823474Z Finished upload of usage-log-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655199221.zip 2022-11-23T02:48:28.3990004Z ##[group]Run # shellcheck disable=SC2156 2022-11-23T02:48:28.3990369Z # shellcheck disable=SC2156 2022-11-23T02:48:28.3990795Z find . -iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2022-11-23T02:48:28.4004660Z shell: /usr/bin/bash -e {0} 2022-11-23T02:48:28.4004905Z env: 2022-11-23T02:48:28.4005157Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:28.4005437Z GPU_FLAG: --gpus all 2022-11-23T02:48:28.4005808Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:28.4006188Z ##[endgroup] 2022-11-23T02:48:28.7082962Z ##[group]Run set -x 2022-11-23T02:48:28.7083272Z set -x 2022-11-23T02:48:28.7083564Z python3 -m pip install -r requirements.txt 2022-11-23T02:48:28.7083916Z python3 -m pip install boto3==1.19.12 2022-11-23T02:48:28.7084326Z python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-11-23T02:48:28.7096688Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:48:28.7096994Z env: 2022-11-23T02:48:28.7097247Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:28.7097506Z GPU_FLAG: --gpus all 2022-11-23T02:48:28.7097893Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:28.7098281Z AWS_DEFAULT_REGION: us-east-1 2022-11-23T02:48:28.7098550Z BRANCH: master 2022-11-23T02:48:28.7098791Z TEST_CONFIG: distributed 2022-11-23T02:48:28.7099052Z SHARD_NUMBER: 3 2022-11-23T02:48:28.7099377Z BUILD_ENVIRONMENT: linux-bionic-cuda11.7-py3.10-gcc7 2022-11-23T02:48:28.7099733Z PR_NUMBER: 
2022-11-23T02:48:28.7099981Z PYTORCH_RETRY_TEST_CASES: 1 2022-11-23T02:48:28.7100274Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-11-23T02:48:28.7100592Z SHA1: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T02:48:28.7100851Z TAG: 2022-11-23T02:48:28.7101087Z WORKFLOW_ID: 3528293554 2022-11-23T02:48:28.7101531Z GITHUB_TOKEN: *** 2022-11-23T02:48:28.7101804Z GHA_WORKFLOW_JOB_ID: 9655199221 2022-11-23T02:48:28.7102055Z ##[endgroup] 2022-11-23T02:48:28.7130918Z + python3 -m pip install -r requirements.txt 2022-11-23T02:48:29.0165458Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T02:48:29.0542008Z Requirement already satisfied: astunparse in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 2)) (1.6.3) 2022-11-23T02:48:29.0582105Z Requirement already satisfied: expecttest in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (0.1.4) 2022-11-23T02:48:29.0594319Z Requirement already satisfied: future in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 4)) (0.18.2) 2022-11-23T02:48:29.0607096Z Requirement already satisfied: hypothesis in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 5)) (6.58.0) 2022-11-23T02:48:29.1165089Z Requirement already satisfied: numpy in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 6)) (1.21.6) 2022-11-23T02:48:29.1177802Z Requirement already satisfied: psutil in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 7)) (5.9.1) 2022-11-23T02:48:29.1293534Z Requirement already satisfied: pyyaml in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 8)) (6.0) 2022-11-23T02:48:29.1305153Z Requirement already satisfied: requests in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 9)) (2.26.0) 2022-11-23T02:48:29.1564975Z Requirement already satisfied: 
setuptools in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 10)) (49.1.3) 2022-11-23T02:48:29.1817685Z Requirement already satisfied: six in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 11)) (1.16.0) 2022-11-23T02:48:29.1830227Z Requirement already satisfied: types-dataclasses in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 12)) (0.6.6) 2022-11-23T02:48:29.1839293Z Requirement already satisfied: typing_extensions in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 13)) (4.4.0) 2022-11-23T02:48:29.1853974Z Requirement already satisfied: sympy in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 14)) (1.10.1) 2022-11-23T02:48:29.1881007Z Requirement already satisfied: filelock in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 15)) (3.8.0) 2022-11-23T02:48:29.1987800Z Requirement already satisfied: networkx in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 16)) (2.6.3) 2022-11-23T02:48:29.2226818Z Requirement already satisfied: jinja2 in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 17)) (3.1.2) 2022-11-23T02:48:29.2260840Z Requirement already satisfied: wheel<1.0,>=0.23.0 in /home/ec2-user/.local/lib/python3.7/site-packages (from astunparse->-r requirements.txt (line 2)) (0.38.4) 2022-11-23T02:48:29.2286978Z Requirement already satisfied: sortedcontainers<3.0.0,>=2.1.0 in /home/ec2-user/.local/lib/python3.7/site-packages (from hypothesis->-r requirements.txt (line 5)) (2.4.0) 2022-11-23T02:48:29.2301565Z Requirement already satisfied: attrs>=19.2.0 in /home/ec2-user/.local/lib/python3.7/site-packages (from hypothesis->-r requirements.txt (line 5)) (22.1.0) 2022-11-23T02:48:29.2678305Z Requirement already satisfied: exceptiongroup>=1.0.0; python_version < "3.11" in 
/home/ec2-user/.local/lib/python3.7/site-packages (from hypothesis->-r requirements.txt (line 5)) (1.0.4) 2022-11-23T02:48:29.2701921Z Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2022.9.24) 2022-11-23T02:48:29.2714649Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (1.26.12) 2022-11-23T02:48:29.2945662Z Requirement already satisfied: charset-normalizer~=2.0.0; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2.0.12) 2022-11-23T02:48:29.2973134Z Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (3.4) 2022-11-23T02:48:29.2989739Z Requirement already satisfied: mpmath>=0.19 in /home/ec2-user/.local/lib/python3.7/site-packages (from sympy->-r requirements.txt (line 14)) (1.2.1) 2022-11-23T02:48:29.3071064Z Requirement already satisfied: MarkupSafe>=2.0 in /home/ec2-user/.local/lib/python3.7/site-packages (from jinja2->-r requirements.txt (line 17)) (2.1.1) 2022-11-23T02:48:29.3787927Z + python3 -m pip install boto3==1.19.12 2022-11-23T02:48:29.6695982Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T02:48:29.6925724Z Requirement already satisfied: boto3==1.19.12 in /home/ec2-user/.local/lib/python3.7/site-packages (1.19.12) 2022-11-23T02:48:29.6994554Z Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /home/ec2-user/.local/lib/python3.7/site-packages (from boto3==1.19.12) (0.5.2) 2022-11-23T02:48:29.7032099Z Requirement already satisfied: botocore<1.23.0,>=1.22.12 in /home/ec2-user/.local/lib/python3.7/site-packages (from boto3==1.19.12) (1.22.12) 2022-11-23T02:48:29.7100094Z Requirement already satisfied: 
jmespath<1.0.0,>=0.7.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from boto3==1.19.12) (0.10.0) 2022-11-23T02:48:29.7117988Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /home/ec2-user/.local/lib/python3.7/site-packages (from botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.26.12) 2022-11-23T02:48:29.7335692Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from botocore<1.23.0,>=1.22.12->boto3==1.19.12) (2.8.2) 2022-11-23T02:48:29.7363876Z Requirement already satisfied: six>=1.5 in /home/ec2-user/.local/lib/python3.7/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.16.0) 2022-11-23T02:48:29.9816518Z + python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-11-23T02:48:45.0178912Z [scribe] Scribe access token not provided, sending report via boto3... 2022-11-23T02:48:45.0179228Z 2022-11-23T02:48:45.0179575Z ----- Historic stats comparison result ------ 2022-11-23T02:48:45.0179790Z 2022-11-23T02:48:45.0180041Z job: linux-bionic-cuda11.7-py3.10-gcc7 2022-11-23T02:48:45.0180738Z commit: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T02:48:45.0180954Z 2022-11-23T02:48:45.0181165Z Commit graph (base is most recent master ancestor with at least one S3 report): 2022-11-23T02:48:45.0181430Z 2022-11-23T02:48:45.0181536Z : (master) 2022-11-23T02:48:45.0181768Z | 2022-11-23T02:48:45.0182026Z * 1cfd3858ac (HEAD) total time 2745.80s 2022-11-23T02:48:45.0185249Z * 26322544b8 (base) 13 reports, total time 4193.12s ± 2023.52s 2022-11-23T02:48:45.0185751Z * 7f4b4d2827 13 reports, total time 4157.15s ± 1970.21s 2022-11-23T02:48:45.0186195Z * b50699f247 13 reports, total time 4190.33s ± 2008.01s 2022-11-23T02:48:45.0186617Z * 8bf8e4d71e 13 reports, total time 4176.50s ± 2000.30s 2022-11-23T02:48:45.0187054Z * ce856cee7e 12 reports, total time 3997.88s ± 1945.32s 2022-11-23T02:48:45.0187490Z * 391b593ca2 13 reports, total time 
4190.46s ± 2006.12s 2022-11-23T02:48:45.0187961Z * 5bba783d21 13 reports, total time 4212.68s ± 2027.67s 2022-11-23T02:48:45.0188415Z * ea920a1115 13 reports, total time 4210.45s ± 1980.15s 2022-11-23T02:48:45.0188850Z * 74e62a1fef 13 reports, total time 4199.06s ± 2006.95s 2022-11-23T02:48:45.0189268Z * 00b7d8ef23 13 reports, total time 4196.42s ± 2038.18s 2022-11-23T02:48:45.0189551Z | 2022-11-23T02:48:45.0189764Z : 2022-11-23T02:48:45.0189900Z 2022-11-23T02:48:45.0190073Z Removed (across 924 suites) 0 tests, totaling 0.00s 2022-11-23T02:48:45.0190414Z Modified (across 0 suites) 0 tests, totaling 0.00s 2022-11-23T02:48:45.0190772Z Added (across 58 suites) 752 tests, totaling +3509.08s 2022-11-23T02:48:45.0790138Z ##[group]Run pytorch/test-infra/.github/actions/teardown-linux@main 2022-11-23T02:48:45.0790476Z with: 2022-11-23T02:48:45.0790692Z env: 2022-11-23T02:48:45.0790938Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:45.0791191Z GPU_FLAG: --gpus all 2022-11-23T02:48:45.0791570Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:45.0792068Z ##[endgroup] 2022-11-23T02:48:45.0809502Z ##[group]Run set -eou pipefail 2022-11-23T02:48:45.0809814Z set -eou pipefail 2022-11-23T02:48:45.0810070Z  2022-11-23T02:48:45.0810398Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2022-11-23T02:48:45.0810729Z for _ in $(seq 1440); do 2022-11-23T02:48:45.0811036Z  # Break if no ssh session exists anymore 2022-11-23T02:48:45.0811342Z  if [ "$(who)" = "" ]; then 2022-11-23T02:48:45.0811582Z  break 2022-11-23T02:48:45.0811862Z  fi 2022-11-23T02:48:45.0812102Z  echo "." 
2022-11-23T02:48:45.0812332Z  sleep 5 2022-11-23T02:48:45.0812568Z done 2022-11-23T02:48:45.0825538Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:48:45.0825825Z env: 2022-11-23T02:48:45.0826070Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:45.0826350Z GPU_FLAG: --gpus all 2022-11-23T02:48:45.0826717Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:45.0827077Z ##[endgroup] 2022-11-23T02:48:45.0856878Z Holding runner for 2 hours until all ssh sessions have logged out 2022-11-23T02:48:45.0905202Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2022-11-23T02:48:45.0905639Z # ignore expansion of "docker ps -q" since it could be empty 2022-11-23T02:48:45.0905988Z # shellcheck disable=SC2046 2022-11-23T02:48:45.0906303Z docker stop $(docker ps -q) || true 2022-11-23T02:48:45.0906614Z # Prune all of the docker images 2022-11-23T02:48:45.0906918Z docker system prune -af 2022-11-23T02:48:45.0919315Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:48:45.0919624Z env: 2022-11-23T02:48:45.0919876Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:48:45.0920135Z GPU_FLAG: --gpus all 2022-11-23T02:48:45.0920532Z DOCKER_CONTAINER_ID: d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:45.0920898Z ##[endgroup] 2022-11-23T02:48:45.7027259Z d8f8c46cdf70 2022-11-23T02:48:46.7952367Z Deleted Containers: 2022-11-23T02:48:46.7953058Z d8f8c46cdf70d83ca8e7165073cff6ef4ae598c50a0ada16b7b9428c2c882107 2022-11-23T02:48:46.7953456Z 2022-11-23T02:48:51.8909852Z Deleted Images: 2022-11-23T02:48:51.8910755Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T02:48:51.8911777Z untagged: 
308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7@sha256:a44ece0129de4f14f08fcb1423a34c97f2f88e2969bfc5ad6f33b15b4dfcfea3 2022-11-23T02:48:51.8912438Z deleted: sha256:ca0e26c7bac33dbcc7a8dbe19f76edee4b91691e765d908dd755388b14acb91e 2022-11-23T02:48:51.8912917Z deleted: sha256:acbff221174f85354b2aba183449be48b8a649044c4d855f47ffe7c15f2f0593 2022-11-23T02:48:51.8913365Z deleted: sha256:6e2b2e86bf8d35dfec9e9487ce66c8ec4c5e366546510dbf2061107259075ec2 2022-11-23T02:48:51.8913820Z deleted: sha256:7ee6329a87fae79992a26743847bbc86bacc2ca03801efcdce7754a05ccb441f 2022-11-23T02:48:51.8914254Z deleted: sha256:6b5c166d5b90baf854740fd230f7d138850141e8dd1069c75900aa09e5e0dc90 2022-11-23T02:48:51.8914686Z deleted: sha256:ab05e7d25d0b30fc6f3cbf720ae9408ac11bcbdfea59731ae72dcc8aa3e520d6 2022-11-23T02:48:51.8915434Z deleted: sha256:f6246cb9656221ce7e6751e93b05e13424fb2583c6137d4edbfef96ceb0fbf4f 2022-11-23T02:48:51.8915892Z deleted: sha256:04d30f9a047eb1f0199adf2b2ca2512f9554b7caad28a2fb14a22a388d305a98 2022-11-23T02:48:51.8916553Z deleted: sha256:5c6550a005fd1407aa8a10de1057e8212385b7ef81fdb403eac9cb25b2cdd30a 2022-11-23T02:48:51.8917005Z deleted: sha256:2db874a48ccf9cf10f3600e1e7e0c5165c462d689f394d7be669d852c0f28f8e 2022-11-23T02:48:51.8917434Z deleted: sha256:1cb9632811ae406a7896c8ad49717c265bba77d94df3a67c23bd0cc1222c93fe 2022-11-23T02:48:51.8917988Z deleted: sha256:0b6a063f7a058d0226b7f958b16f7f2e4b68f39e6275890c7a7a0d339ebf9a70 2022-11-23T02:48:51.8918402Z deleted: sha256:afef4e32922b9a5b37e91ef12e33891a7d22f056ccb366412c89d8b775231c67 2022-11-23T02:48:51.8918847Z deleted: sha256:13dcda7fb091c2ea82bde2883af19edb94358834314a6d7d89de972fddf854cc 2022-11-23T02:48:51.8919313Z deleted: sha256:ba4023bbeb7ba3bdfc8353b5e973c00f1dae850fe0e4232344bca0d7bdc5ed21 2022-11-23T02:48:51.8919756Z deleted: sha256:eda87a7e3963049f09510a9664306ce96a1e843bd2ed9306aad33978f2415934 2022-11-23T02:48:51.8920168Z deleted: 
sha256:ad917be56ac1ba848d966f7332b79ef401b9420b7952fba45570faf98875de2c 2022-11-23T02:48:51.8920606Z deleted: sha256:3c2ba68cc2c00d49222172449200284f1caf0e9b25d7eeba6698297ef42c7059 2022-11-23T02:48:51.8921046Z deleted: sha256:1505992ac44d07e9564edf19e4ed03a31f7609cbfd0cc4bd607a988bd64d6592 2022-11-23T02:48:51.8921463Z deleted: sha256:7e3ce05cb4796e4f977d34b2c9499faabd362b94750f42ea36023785fd07210b 2022-11-23T02:48:51.8921938Z deleted: sha256:99012b24a24f7cbe1787c76d263d7cf9556d2542c33e1f4fde47d5a9bfced7bf 2022-11-23T02:48:51.8922374Z deleted: sha256:44a93e00ff78481f1922fd26545eba58d27c25ede11210755a18ac1089d34491 2022-11-23T02:48:51.8922790Z deleted: sha256:3d35c57c87aa053605d9839c481d2505ef05d74269206229cd53cf56291a84bc 2022-11-23T02:48:51.8923201Z deleted: sha256:d0644c40eee52273b45cfd6a9d443606fd3b1d570bba60df414bef688531865f 2022-11-23T02:48:51.8923652Z deleted: sha256:87f1587cf0fc998c2fd781ff08bd41ecd3f1dce58e89485eae3f97cf5e65a224 2022-11-23T02:48:51.8924104Z deleted: sha256:e967cbd5936283cfefeee219340c1dc42b04c590734c01da262f3591426d7033 2022-11-23T02:48:51.8924547Z deleted: sha256:cb4e3fb37d95ff54c5467335caa5a165d6951a524c72098c8aa50e91489a1cf7 2022-11-23T02:48:51.8924969Z deleted: sha256:c999a26e008b2a5a646e886f297ad5b2efd1ffc6de561d309eecf7e904d80411 2022-11-23T02:48:51.8925415Z deleted: sha256:0f7c27db48adc87df4d2d2275a157eb1b46971b7b9901cf97c7e5f95a0bd9f3a 2022-11-23T02:48:51.8925860Z deleted: sha256:6172bd74910d9d7ff816aac3ed295122ccc70f926d75b17196d3a8a7da581a48 2022-11-23T02:48:51.8926298Z deleted: sha256:f6c04aacc51d4155e8c2e978c915ad2c1a0d170bf3e933aeeaa66f7f5e9f529b 2022-11-23T02:48:51.8926751Z deleted: sha256:1a123d1e783cb340d39fb0fd6af62f0f31ec366902386b4843d78aa15341ca35 2022-11-23T02:48:51.8927181Z deleted: sha256:14682bd215f05536bf4a50be70610b9092a6f0c799b9adfa102ff191b669013f 2022-11-23T02:48:51.8927599Z deleted: sha256:055f5d43f9fee2ec66f09539d8c1624084e8029d0e74981a53354b7ff44c0af1 2022-11-23T02:48:51.8928017Z deleted: 
sha256:e1d86ed5bd48d6e66387d31531db825bf43e642ec87d3aea0ae602bda9362204 2022-11-23T02:48:51.8928455Z deleted: sha256:717c3163e25327a177ddf78de4dcdea20521942a9cb18c8e1b21e8336708455b 2022-11-23T02:48:51.8928901Z deleted: sha256:eb3a046c873cc3023f10cc7a3da759dda1fc3977511c26c8fe12d02cd35d9d3b 2022-11-23T02:48:51.8929329Z deleted: sha256:f7d654b861e155438fb4017f993e78f94e4ab971ba70a9923b9578da6f521079 2022-11-23T02:48:51.8929753Z deleted: sha256:9f6e64fe4d84e561aa5f4fb6ca895b16073451674b2e9234678faa71885b7c1a 2022-11-23T02:48:51.8930197Z deleted: sha256:cd4ed903887c2663e7ae16a705cdb277bfac87a35e8b3930e0f7c28f392243ff 2022-11-23T02:48:51.8930644Z deleted: sha256:6acd45a2d76545be71b8be690cb8bf0530cd1322d5aeb90ccd0025b76aeeb9be 2022-11-23T02:48:51.8931066Z deleted: sha256:5f6fad27350d1485f97817591e1e38f341c4f2466973c5afef012cde54640119 2022-11-23T02:48:51.8931486Z deleted: sha256:e77f111f1f3e2f93994c6b428a64619af7dad3012f71096de895d7965c6f6f54 2022-11-23T02:48:51.8931927Z deleted: sha256:1f60aaab75e1b24ad8a6dda4c7591a4cbe1c0233111faa95bbb121e71744f5e2 2022-11-23T02:48:51.8932360Z deleted: sha256:ed3557da864e7a48d82ca4fd51c078d177841f99d52a8480a8442fc6da95b3ef 2022-11-23T02:48:51.8932900Z deleted: sha256:e1cb5ac03c90992794bbcece5ef3a88b8d3dbfebb070b7e7b7dc6e1066a0bd0b 2022-11-23T02:48:51.8933355Z deleted: sha256:92b610b902f3255c897a349782a959cd391014cb7949989ae136548de42631f8 2022-11-23T02:48:51.8933767Z deleted: sha256:c52692915b570d66df58947dd8d7cba68fcfcf8fc296a447481578eb25879692 2022-11-23T02:48:51.8934185Z deleted: sha256:8d99114ec4c3615c49f00fa0fa40e0db8d9445f641c42ffdc127590c6fa9fe9a 2022-11-23T02:48:51.8934689Z deleted: sha256:b3b2969d5ff622447527c54efa6f9d15f8bc70de5db17f46ae11efd709999fbd 2022-11-23T02:48:51.8935132Z deleted: sha256:bfb13edd0300dc2515d887adb670832f4a85848e65db3348bbb1fd74e90fc5fc 2022-11-23T02:48:51.8935572Z deleted: sha256:c1ca91bea3c8e97f0d48fbfe60f987231f95874fcf29a94b76bb67dca54cad18 2022-11-23T02:48:51.8936014Z deleted: 
sha256:b5ef1223e23564cc06af079be55d6a3962040fb669c5d8724912ff6ea2075e59 2022-11-23T02:48:51.8936441Z deleted: sha256:69f57fbceb1b420d7e4697e0f6514887b0805ee0059bea7d51e0a832962e74bf 2022-11-23T02:48:51.8936688Z 2022-11-23T02:48:51.8936835Z Total reclaimed space: 20.08GB 2022-11-23T02:48:51.8990470Z Post job cleanup. 2022-11-23T02:48:51.9029449Z Post job cleanup. 2022-11-23T02:48:52.0395973Z [command]/usr/bin/git version 2022-11-23T02:48:52.0451029Z git version 2.37.1 2022-11-23T02:48:52.0514001Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/41b1a060-b9fe-494d-9264-250c75c51e3c' before making global git config changes 2022-11-23T02:48:52.0514560Z Adding repository directory to the temporary git global config as a safe directory 2022-11-23T02:48:52.0520965Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T02:48:52.0563727Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-11-23T02:48:52.0600897Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-11-23T02:48:52.0926183Z Entering 'android/libs/fbjni' 2022-11-23T02:48:52.0969315Z Entering 'third_party/FP16' 2022-11-23T02:48:52.1010981Z Entering 'third_party/FXdiv' 2022-11-23T02:48:52.1051635Z Entering 'third_party/NNPACK' 2022-11-23T02:48:52.1093901Z Entering 'third_party/QNNPACK' 2022-11-23T02:48:52.1138202Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T02:48:52.1180642Z Entering 'third_party/XNNPACK' 2022-11-23T02:48:52.1236123Z Entering 'third_party/benchmark' 2022-11-23T02:48:52.1299727Z Entering 'third_party/cpuinfo' 2022-11-23T02:48:52.1319549Z Entering 'third_party/cub' 2022-11-23T02:48:52.1362774Z Entering 'third_party/cudnn_frontend' 2022-11-23T02:48:52.1410395Z Entering 'third_party/cutlass' 2022-11-23T02:48:52.1460349Z Entering 
'third_party/eigen' 2022-11-23T02:48:52.1505572Z Entering 'third_party/fbgemm' 2022-11-23T02:48:52.1547848Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T02:48:52.1588751Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T02:48:52.1631385Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T02:48:52.1674347Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T02:48:52.1717800Z Entering 'third_party/flatbuffers' 2022-11-23T02:48:52.1762450Z Entering 'third_party/fmt' 2022-11-23T02:48:52.1804659Z Entering 'third_party/foxi' 2022-11-23T02:48:52.1846446Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T02:48:52.1888567Z Entering 'third_party/gloo' 2022-11-23T02:48:52.1931110Z Entering 'third_party/googletest' 2022-11-23T02:48:52.1973648Z Entering 'third_party/ideep' 2022-11-23T02:48:52.2014723Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T02:48:52.2058938Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T02:48:52.2107792Z Entering 'third_party/ios-cmake' 2022-11-23T02:48:52.2149813Z Entering 'third_party/ittapi' 2022-11-23T02:48:52.2191045Z Entering 'third_party/kineto' 2022-11-23T02:48:52.2232821Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T02:48:52.2273637Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T02:48:52.2317529Z Entering 'third_party/nccl/nccl' 2022-11-23T02:48:52.2360996Z Entering 'third_party/neon2sse' 2022-11-23T02:48:52.2402676Z Entering 'third_party/nlohmann' 2022-11-23T02:48:52.2445702Z Entering 'third_party/onnx' 2022-11-23T02:48:52.2501953Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T02:48:52.2544675Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T02:48:52.2588684Z Entering 'third_party/onnx-tensorrt' 2022-11-23T02:48:52.2629604Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T02:48:52.2676076Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 
2022-11-23T02:48:52.2718263Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T02:48:52.2759331Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T02:48:52.2806315Z Entering 'third_party/pocketfft' 2022-11-23T02:48:52.2847620Z Entering 'third_party/protobuf' 2022-11-23T02:48:52.2894722Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T02:48:52.2937517Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T02:48:52.2984225Z Entering 'third_party/psimd' 2022-11-23T02:48:52.3028299Z Entering 'third_party/pthreadpool' 2022-11-23T02:48:52.3071242Z Entering 'third_party/pybind11' 2022-11-23T02:48:52.3112940Z Entering 'third_party/python-enum' 2022-11-23T02:48:52.3155517Z Entering 'third_party/python-peachpy' 2022-11-23T02:48:52.3198099Z Entering 'third_party/python-six' 2022-11-23T02:48:52.3240347Z Entering 'third_party/sleef' 2022-11-23T02:48:52.3282569Z Entering 'third_party/tbb' 2022-11-23T02:48:52.3326107Z Entering 'third_party/tensorpipe' 2022-11-23T02:48:52.3368145Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T02:48:52.3414760Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T02:48:52.3456760Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T02:48:52.3498655Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T02:48:52.3539483Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T02:48:52.3584511Z Entering 'third_party/zstd' 2022-11-23T02:48:52.3644312Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-11-23T02:48:52.3675757Z http.https://github.com/.extraheader 2022-11-23T02:48:52.3687070Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2022-11-23T02:48:52.3724477Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 
'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-11-23T02:48:52.4039566Z Entering 'android/libs/fbjni' 2022-11-23T02:48:52.4063404Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4097256Z Entering 'third_party/FP16' 2022-11-23T02:48:52.4121921Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4153272Z Entering 'third_party/FXdiv' 2022-11-23T02:48:52.4178003Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4210078Z Entering 'third_party/NNPACK' 2022-11-23T02:48:52.4234776Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4268179Z Entering 'third_party/QNNPACK' 2022-11-23T02:48:52.4292680Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4325325Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T02:48:52.4349622Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4382814Z Entering 'third_party/XNNPACK' 2022-11-23T02:48:52.4407262Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4451946Z Entering 'third_party/benchmark' 2022-11-23T02:48:52.4476734Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4509021Z Entering 'third_party/cpuinfo' 2022-11-23T02:48:52.4533912Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4566014Z Entering 'third_party/cub' 2022-11-23T02:48:52.4590067Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4621906Z Entering 'third_party/cudnn_frontend' 2022-11-23T02:48:52.4646222Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4684324Z Entering 'third_party/cutlass' 2022-11-23T02:48:52.4709086Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4750799Z Entering 'third_party/eigen' 2022-11-23T02:48:52.4776378Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4812418Z Entering 'third_party/fbgemm' 2022-11-23T02:48:52.4837055Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4869136Z Entering 'third_party/fbgemm/third_party/asmjit' 
2022-11-23T02:48:52.4894506Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4926734Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T02:48:52.4950930Z http.https://github.com/.extraheader 2022-11-23T02:48:52.4983908Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T02:48:52.5009851Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5044074Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T02:48:52.5068039Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5102036Z Entering 'third_party/flatbuffers' 2022-11-23T02:48:52.5126527Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5160377Z Entering 'third_party/fmt' 2022-11-23T02:48:52.5184205Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5216673Z Entering 'third_party/foxi' 2022-11-23T02:48:52.5241424Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5273084Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T02:48:52.5298054Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5330008Z Entering 'third_party/gloo' 2022-11-23T02:48:52.5353951Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5386837Z Entering 'third_party/googletest' 2022-11-23T02:48:52.5412370Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5445004Z Entering 'third_party/ideep' 2022-11-23T02:48:52.5469340Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5501215Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T02:48:52.5526583Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5561154Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T02:48:52.5585414Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5625597Z Entering 'third_party/ios-cmake' 2022-11-23T02:48:52.5650357Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5682696Z Entering 'third_party/ittapi' 2022-11-23T02:48:52.5706673Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5738476Z Entering 'third_party/kineto' 
2022-11-23T02:48:52.5763302Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5795200Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T02:48:52.5819716Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5852664Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T02:48:52.5877286Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5911361Z Entering 'third_party/nccl/nccl' 2022-11-23T02:48:52.5937138Z http.https://github.com/.extraheader 2022-11-23T02:48:52.5969427Z Entering 'third_party/neon2sse' 2022-11-23T02:48:52.5993444Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6025113Z Entering 'third_party/nlohmann' 2022-11-23T02:48:52.6049693Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6084738Z Entering 'third_party/onnx' 2022-11-23T02:48:52.6109088Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6155171Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T02:48:52.6179612Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6211769Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T02:48:52.6235462Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6270205Z Entering 'third_party/onnx-tensorrt' 2022-11-23T02:48:52.6296001Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6327214Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T02:48:52.6350669Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6388045Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T02:48:52.6413032Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6446279Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T02:48:52.6470905Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6503926Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T02:48:52.6528238Z http.https://github.com/.extraheader 
2022-11-23T02:48:52.6566163Z Entering 'third_party/pocketfft' 2022-11-23T02:48:52.6590068Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6622388Z Entering 'third_party/protobuf' 2022-11-23T02:48:52.6647896Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6684670Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T02:48:52.6708168Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6740570Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T02:48:52.6765012Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6799504Z Entering 'third_party/psimd' 2022-11-23T02:48:52.6823490Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6855837Z Entering 'third_party/pthreadpool' 2022-11-23T02:48:52.6880600Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6912120Z Entering 'third_party/pybind11' 2022-11-23T02:48:52.6936908Z http.https://github.com/.extraheader 2022-11-23T02:48:52.6970302Z Entering 'third_party/python-enum' 2022-11-23T02:48:52.6994694Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7026451Z Entering 'third_party/python-peachpy' 2022-11-23T02:48:52.7050911Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7082863Z Entering 'third_party/python-six' 2022-11-23T02:48:52.7107237Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7139153Z Entering 'third_party/sleef' 2022-11-23T02:48:52.7163782Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7195452Z Entering 'third_party/tbb' 2022-11-23T02:48:52.7219975Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7254184Z Entering 'third_party/tensorpipe' 2022-11-23T02:48:52.7279737Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7312665Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T02:48:52.7336937Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7369137Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T02:48:52.7393375Z 
http.https://github.com/.extraheader 2022-11-23T02:48:52.7427409Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T02:48:52.7451694Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7484842Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T02:48:52.7508846Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7540834Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T02:48:52.7566164Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7602954Z Entering 'third_party/zstd' 2022-11-23T02:48:52.7627221Z http.https://github.com/.extraheader 2022-11-23T02:48:52.7929656Z Cleaning up orphan processes